Re: Fix git-svn for SVN 1.7

2012-07-31 Thread Junio C Hamano
Eric Wong normalper...@yhbt.net writes:

 Perhaps we can depend on the URI.pm module?  It seems to be
 widely-available and not be a significant barrier to installation.  On
 the other hand, I don't know its history, either (especially since we're
 now dealing with SVN changes...).

 Anyways, I don't like relying on operator overloading, it makes code
 harder to read and review.

I think code that uses operator overloading, when printed in a
textbook, cast in stone so that the reader knows it is never
going to change, is indeed easy to read through.  But I suspect
that it may merely be giving readers a false impression that it is
easy to follow.

The problem is that use of such obscure overloading tends to hurt
maintainability. If the initial version Michael produces converts
all the external strings into instances of CanonicalizedPath class,
according to the "convert as early as possible" principle, you can
be assured that all "eq" you see are about the normalized strings
the svn library wants to see, and that may allow us to sleep safely.

But the real problem begins six months down the road, when somebody
wants to add a new codepath that reads a new string from an external
source (e.g. perhaps you add a new configuration variable that
specifies a path in the svn repository and does something special
when that path is touched by a revision; the exact nature of the new
feature does not matter in this discussion).  The new code can
forget to follow the "convert early" principle, and pass a bare
string read from the configuration around.

A comparison between such a new string and another variable that
holds a path that comes from the existing codepath (i.e. Michael's
initial code that perfectly follows the "convert early" principle)
will still use the overloaded "eq" in "$new_str eq $old_path", thanks
to the language rule of Perl (namely, even though the new string is
a non-object, the other side is still an instance of the class).

When the code needs to compare two or more such new strings (e.g.
perhaps it wants to remove duplicates from the set of paths it reads
from the configuration), however, "eq" silently turns back into a
simple string comparison, as "$new_1 eq $new_2" will not magically
turn into "Canonicalize($new_1)->cmp(Canonicalize($new_2))".

This kind of error is unnecessarily hard to catch, mostly because the
previous "$new_str eq $old_path" does work; it masks the problem.
Overloading of "eq" is making it harder to spot new bugs.

If the code never uses "eq" to compare canonicalized paths, and all
the surrounding code compares paths using explicit method calls on
objects, it makes it crystal clear to the readers that a path held in
a bare string is unwelcome in the codepath.  It makes it harder to
add new code that uses and passes around a bare string by mistake to
such a codepath, I would think.
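
The masking effect described above is not specific to Perl. The following
is a minimal sketch in Python rather than Perl, assuming a hypothetical
CanonicalPath class and a canonicalize() helper that simply normalizes
slashes; neither name comes from git-svn, and real SVN canonicalization
does more than this. It shows a mixed comparison succeeding through the
overloaded operator, a comparison of two bare strings quietly falling
back to plain string equality, and an explicit method that refuses bare
strings outright.

```python
import posixpath


def canonicalize(raw):
    # Stand-in for real path canonicalization (assumption for this sketch).
    return posixpath.normpath(raw)


class CanonicalPath:
    """Hypothetical analogue of the CanonicalizedPath class in the thread."""

    def __init__(self, raw):
        self.value = canonicalize(raw)

    def __eq__(self, other):
        # Overloaded comparison: a bare string on the other side is
        # canonicalized on the fly, which hides the missing conversion.
        if isinstance(other, CanonicalPath):
            return self.value == other.value
        if isinstance(other, str):
            return self.value == canonicalize(other)
        return NotImplemented

    def __hash__(self):
        return hash(self.value)

    def same_as(self, other):
        # Explicit method: a bare string is rejected instead of tolerated.
        if not isinstance(other, CanonicalPath):
            raise TypeError("expected CanonicalPath, got bare %s"
                            % type(other).__name__)
        return self.value == other.value


old_path = CanonicalPath("trunk/lib")   # converted early, as intended
new_1 = "trunk//lib/"                   # bare strings that skipped conversion
new_2 = "trunk/lib"

mixed_ok = (new_1 == old_path)          # overloaded comparison is used
bare_ok = (new_1 == new_2)              # plain string comparison is used
```

With the explicit method, old_path.same_as(new_1) raises TypeError at the
first use of a bare string, so the mistake surfaces before two bare strings
ever get compared with each other.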

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: inconsistent logs when displayed on screen / piped to a file

2012-07-31 Thread Jan Engelhardt
On Monday 2012-07-30 16:58, Mojca Miklavec wrote:

 COLUMNS=<YourNumber> git log <YourArgs> > <YourFile>

Wow, perfect, thank you very much. Setting COLUMNS=200 (the high
number just in case) solved the problem.

200 ought to be enough for everybody? PATH_MAX is never enough...


Re: [PATCH] send-email: improve RFC2047 quote parsing

2012-07-31 Thread Thomas Rast
Junio C Hamano gits...@pobox.com writes:

 Thomas Rast tr...@student.ethz.ch writes:

 This patch fixes the first two by doing a more careful decoding of the
 "=AB" outer quoting.  Fixing the fundamental issues is left for a
 future, more intrusive, patch.

 What is this "=AB" thing?

The two-hex-digits quoting in the style of MIME quoted-printable.  I
called it the "outer quoting" (RFC2047: "encoding") because it serves to
protect the bytes from transport damage; there is another encoding
(RFC2047: "character set") inside, which is specified by the
=?utf-8?...?= wrapper.

BTW, note that we also only handle the "Q" outer quoting
(quoted-printable).  There is a "B" encoding, which your email in fact
used in the Cc: header:

Cc: [...] =?utf-8?B?SsO8cmdlbiBSw7xobGU=?= j...@online.de

"B" means Base64, as you can probably guess from the looks of it.
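
To make the two layers concrete, here is a small sketch using Python's
standard email.header module (not the Perl code in git-send-email). The
B-encoded word is the one from the Cc: header above; the Q-encoded word
is constructed for this example and does not appear in the thread, and
decode_rfc2047 is a hypothetical helper name.

```python
from email.header import decode_header, make_header


def decode_rfc2047(value):
    # decode_header undoes the outer quoting (Q or B) and reports the
    # character set; make_header then applies the character set.
    return str(make_header(decode_header(value)))


# Encoded word taken from the Cc: header quoted above ("B" = Base64).
b_form = "=?utf-8?B?SsO8cmdlbiBSw7xobGU=?="

# The same name written with "Q" quoting (built for this example).
# In Q quoting, "=XX" is a byte in hex and "_" stands for a space.
q_form = "=?utf-8?Q?J=C3=BCrgen_R=C3=BChle?="

name_from_b = decode_rfc2047(b_form)
name_from_q = decode_rfc2047(q_form)
```

Both forms decode to the same name; only the transport protection differs,
while the inner character set (utf-8) is identical.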

 This is the easy part, fixed as per Junio's comment that it needs to
 use a .*? match for the contents, and with a test.

 What's the hard part?  Do you mean the "fundamentally cannot" part?

Yes, and by "fundamentally" I meant "not without fixing something
outside of this function, which I am too lazy to do at this time". ;-)

-- 
Thomas Rast
trast@{inf,student}.ethz.ch


Re: [PATCH/RFC] l10n: de.po: translate 4 new messages

2012-07-31 Thread Jan Engelhardt

On Monday 2012-07-30 18:08, Ralf Thielow wrote:

Translate 4 new messages came from git.pot update in 0bbe5b4
(l10n: Update git.pot (4 new, 3 removed messages)).

Signed-off-by: Ralf Thielow ralf.thie...@gmail.com
---
Hi German l10n team,

please review this small update on German
translation.

Patch is fine from a translation POV;
but I wonder where my contributions had gone.
Ævar, were they ever merged?

commit 0c3db7e983a58f53cbd468e11937750e155de179
Author: Jan Engelhardt jeng...@medozas.de
Date:   Thu Oct 7 20:52:26 2010 +0200

po/de.po: complete German translation

Translate all 689 currently translatable messages in Git into
German. Making the German translation 100% complete.

[Ævar Arnfjörð Bjarmason: Modified by running msgmerge(1) on it to
normalize the line wrapping, and squashed two of Jan's commits
together]

Signed-off-by: Jan Engelhardt jeng...@medozas.de
Signed-off-by: Ævar Arnfjörð Bjarmason ava...@gmail.com



Re: [PATCH/RFC] l10n: de.po: translate 4 new messages

2012-07-31 Thread Ralf Thielow
On Tue, Jul 31, 2012 at 10:13 AM, Jan Engelhardt jeng...@inai.de wrote:

 On Monday 2012-07-30 18:08, Ralf Thielow wrote:

Translate 4 new messages came from git.pot update in 0bbe5b4
(l10n: Update git.pot (4 new, 3 removed messages)).

Signed-off-by: Ralf Thielow ralf.thie...@gmail.com
---
Hi German l10n team,

please review this small update on German
translation.

 Patch is fine from a translation POV;
 but I wonder where my contributions had gone.
 Ævar, were they ever merged?

 commit 0c3db7e983a58f53cbd468e11937750e155de179
 Author: Jan Engelhardt jeng...@medozas.de
 Date:   Thu Oct 7 20:52:26 2010 +0200

 po/de.po: complete German translation

 Translate all 689 currently translatable messages in Git into
 German. Making the German translation 100% complete.

 [Ævar Arnfjörð Bjarmason: Modified by running msgmerge(1) on it to
 normalize the line wrapping, and squashed two of Jan's commits
 together]

 Signed-off-by: Jan Engelhardt jeng...@medozas.de
 Signed-off-by: Ævar Arnfjörð Bjarmason ava...@gmail.com


I didn't notice that you had made a contribution to the German
translation. As described in po/README, we have a dedicated
repository on GitHub [1], which is a fork of the git-po repo. If you
want to contribute, you can fork this repo and send a pull request,
or send a patch to the ML. Please read po/README for more information.

[1] https://github.com/ralfth/git-po-de


Re: GIT smart http vs GIT over ssh

2012-07-31 Thread Michael J Gruber
vishwajeet singh venit, vidit, dixit 31.07.2012 05:19:
 On Tue, Jul 31, 2012 at 8:40 AM, Konstantin Khomoutov
 kostix+...@007spb.ru wrote:
 On Tue, Jul 31, 2012 at 08:36:07AM +0530, vishwajeet singh wrote:

 Just wanted to know the difference between smart http and ssh and in
 what scenarios we need them
 I am setting up a git server,  can I just do with smart http support
 or I need to enable the ssh support to use git effectively.
 As I understand github provides both the protocols, what's the reason
 for supporting both protocols.
 http://git-scm.com/book/en/Git-on-the-Server-The-Protocols
 http://git-scm.com/2010/03/04/smart-http.html

 
 Thanks for the links, I have already gone through those links, was
 looking from implementation perspective do I really need to support
 both protocols on my server or I can just do with smart http and
 what's the preferred way of doing it smart http or ssh
 
 

You need to provide what your users demand ;)

Seriously, this is why GitHub and other providers offer both. Not only
are some users more comfortable with one protocol or the other (Win
users don't prefer ssh generally) but some might be able to use only one
because of firewalls or corporate rules.

From the server perspective, the setup is completely different, of
course. Do you have shell accounts already which you want to reuse for
ssh+git? Do you prefer setting up a special purpose shell account
(gitosis/gitolite) or setting up a web server with authentication?

Michael


rename detection

2012-07-31 Thread Gerlando Falauto

Hi everyone,

I have some questions about rename detection.
The way I understand it, renames are not tracked in any way by GIT, at 
least not in the repository. Instead some detection algorithm is 
executed when data is extracted from the repository, prior to being 
presented to the user (i.e., git format-patch, git log, git show 
etc...), therefore depending on the command line and client used.
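
As a rough illustration of detection at display time rather than at
storage time, the sketch below pairs each added file with the most
similar deleted file when the similarity clears a threshold. This is
not Git's actual algorithm: difflib.SequenceMatcher stands in for Git's
own similarity estimate, and the function names are assumptions made
for this example. The 0.5 default mirrors the 50% default of -M.

```python
import difflib


def similarity(old_text, new_text):
    # Line-based similarity in [0, 1]; a stand-in for Git's own score.
    matcher = difflib.SequenceMatcher(
        None, old_text.splitlines(), new_text.splitlines())
    return matcher.ratio()


def detect_renames(deleted, added, threshold=0.5):
    """Map each added path to its best-matching deleted path.

    `deleted` and `added` are dicts of path -> file content. Nothing is
    stored about renames; the pairing is recomputed from content alone.
    """
    renames = {}
    for new_path, new_text in added.items():
        best_path, best_score = None, 0.0
        for old_path, old_text in deleted.items():
            score = similarity(old_text, new_text)
            if score > best_score:
                best_path, best_score = old_path, score
        if best_path is not None and best_score >= threshold:
            renames[new_path] = best_path
    return renames


# A small file1 with a bigger file2 inserted into its middle, as file3.
file1 = "".join("a%d\n" % i for i in range(10))
file2 = "".join("b%d\n" % i for i in range(20))
lines1 = file1.splitlines(keepends=True)
file3 = "".join(lines1[:5]) + file2 + "".join(lines1[5:])

pairs = detect_renames({"file1": file1, "file2": file2}, {"file3": file3})
```

Because the pairing depends on the threshold and on which paths count as
deleted or added in a given comparison, different commands or options can
legitimately report different renames for the same commit.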


Some of those mechanisms are also in place when stuff gets committed. For 
instance, I get some rename indications when editing the commit message, 
and in the output of git commit itself.

I would assume the mechanisms would be exactly the same.

Things get a bit more complicated when you actually merge files (not 
in the git-tish sense, I mean physically move the content of one file 
into the other). Here is a test which includes a bigger file within a 
smaller file.


=== SCRIPT BEGIN 
git --version
git init
hexdump -C -n 5120 /dev/urandom > file2
hexdump -C -n 2560 /dev/urandom > file1
echo --
git add file1 file2
git commit -m "first commit"
git log --summary -M10% -C -1
echo --
(head -n 40 file1; cat file2; tail -n+40 file1) > file3
git rm file1 file2
git add file3
git commit -m "including file2 within file1 as file3"
git log --summary -M10% -C -1
echo --
git mv file3 file1
git commit --amend -m "including file2 within file1 as file1"
git log --summary -M10% -C -1
echo --
git mv file1 file2
git commit --amend -m "including file2 within file1 as file2"
git log --summary -M10% -C -1
echo --

=== SCRIPT END 

=== OUTPUT BEGIN 
git version 1.7.10.4
Initialized empty Git repository in /home/chfalag1/tmp/gittest/.git/
--
[master (root-commit) 6d997f5] first commit
 2 files changed, 482 insertions(+)
 create mode 100644 file1
 create mode 100644 file2
commit 6d997f5bbed2e9452317991ca024a5a0e1715027
Author: Gerlando Falauto gerlando.fala...@keymile.com
Date:   Tue Jul 31 10:27:42 2012 +0200

first commit

 create mode 100644 file1
 create mode 100644 file2
--
rm 'file1'
rm 'file2'
[master 424edab] including file2 within file1 as file3
 2 files changed, 162 insertions(+), 161 deletions(-)
 delete mode 100644 file1
 rename file2 => file3 (66%)
commit 424edab771495fc3a1b4c172b9fcef9418501266
Author: Gerlando Falauto gerlando.fala...@keymile.com
Date:   Tue Jul 31 10:27:42 2012 +0200

including file2 within file1 as file3

 delete mode 100644 file1
 rename file2 => file3 (66%)
--
[master ca70367] including file2 within file1 as file1
 1 file changed, 162 insertions(+)
 rename file2 => file1 (66%)
commit ca7036705063adbbd3c8cd0b5bccd5fbf44075bf
Author: Gerlando Falauto gerlando.fala...@keymile.com
Date:   Tue Jul 31 10:27:42 2012 +0200

including file2 within file1 as file1

 delete mode 100644 file2
--
[master d7fdea4] including file2 within file1 as file2
 2 files changed, 162 insertions(+), 161 deletions(-)
 delete mode 100644 file1
commit d7fdea4855efe8401562a53ec7093c80390ee274
Author: Gerlando Falauto gerlando.fala...@keymile.com
Date:   Tue Jul 31 10:27:42 2012 +0200

including file2 within file1 as file2

 delete mode 100644 file1
--
=== OUTPUT END 

So including file2 (bigger) within file1 (smaller):
a) gets always (commit+extraction) detected as a rename file2=file1 if 
the merged file is file3 (new file).
b) gets detected as a rename ONLY during commit (but not while 
extracting) if the merged file is file1 (existing file)
c) doesn't ever get detected as a rename if the merged file is file2 
(which makes sense, being file2 more similar to itself than to file1)


So now my two questions:

1) Is the behavior in b) correct? Shouldn't it at least be made consistent?

2) Would it make any sense to track (or detect) such inclusion cases? 
Is there any recommended or standard practice for performing such 
operations as file merge/split (i.e. when refactoring code or something)?


Thanks!
Gerlando


Re: GIT smart http vs GIT over ssh

2012-07-31 Thread vishwajeet singh
On Tue, Jul 31, 2012 at 2:17 PM, Michael J Gruber
g...@drmicha.warpmail.net wrote:
 vishwajeet singh venit, vidit, dixit 31.07.2012 05:19:
 On Tue, Jul 31, 2012 at 8:40 AM, Konstantin Khomoutov
 kostix+...@007spb.ru wrote:
 On Tue, Jul 31, 2012 at 08:36:07AM +0530, vishwajeet singh wrote:

 Just wanted to know the difference between smart http and ssh and in
 what scenarios we need them
 I am setting up a git server,  can I just do with smart http support
 or I need to enable the ssh support to use git effectively.
 As I understand github provides both the protocols, what's the reason
 for supporting both protocols.
 http://git-scm.com/book/en/Git-on-the-Server-The-Protocols
 http://git-scm.com/2010/03/04/smart-http.html


 Thanks for the links, I have already gone through those links, was
 looking from implementation perspective do I really need to support
 both protocols on my server or I can just do with smart http and
 what's the preferred way of doing it smart http or ssh



 You need to provide what your users demand ;)

 Seriously, this is why GitHub and other providers offer both. Not only
 are some users more comfortable with one protocol or the other (Win
 users don't prefer ssh generally) but some might be able to use only one
 because of firewalls or corporate rules.

 From the server perspective, the setup is completely different, of
 course. Do you have shell accounts already which you want to reuse for
 ssh+git? Do you prefer setting up a special purpose shell account
 (gitosis/gitolite) or setting up a web server with authentication?

I already have a server set up with the smart http backend; I was just
wondering whether my users would really need ssh support, and I agree
with your point that it should be based on user demand.

While going through the git book I encountered a very tall claim about
smart http:
 "I think this will become the standard Git protocol in the very near
future. I believe this because it's both efficient and can be run
either secure and authenticated (https) or open and unauthenticated
(http). It also has the huge advantage that most firewalls have those
ports (80 and 443) open already and normal users don't have to deal
with ssh-keygen and the like. Once most clients have updated to at
least v1.6.6, http will have a big place in the Git world."

http://git-scm.com/2010/03/04/smart-http.html

Just based on the above comment in the book, I thought that if smart http
is the way to go in the future, why take on the hassle of setting up ssh.

I was planning to use gitosis as I have a python background, but the code
looks like it has not been maintained for quite some time, which worries
me a bit. I stumbled upon gitomatic
https://github.com/jorgeecardona/gitomatic; does anyone have any prior
experience with it?

Any thoughts or suggestions are welcome.

 Michael



-- 
Vishwajeet Singh
+91-9657702154 | dextrou...@gmail.com | http://bootstraptoday.com
Twitter: http://twitter.com/vishwajeets | LinkedIn:
http://www.linkedin.com/in/singhvishwajeet


Re: [PATCH] rebase -i: handle fixup of root commit correctly

2012-07-31 Thread Johannes Sixt

Am 24.07.2012 14:17, schrieb Chris Webb:

There is a bug with git rebase -i --root when a fixup or squash line is
applied to the new root. We attempt to amend the commit onto which they
apply with git reset --soft HEAD^ followed by a normal commit. Unlike a
real commit --amend, this sequence will fail against a root commit as it
has no parent.

Fix rebase -i to use commit --amend for fixup and squash instead, and
add a test for the case of a fixup of the root commit.

Signed-off-by: Chris Webbch...@arachsys.com
---

Sorry, I should have spotted this issue when I did the original root-rebase
series. I've checked that this patch doesn't break any of the existing
tests, as well as satisfying the newly introduced check for the root-fixup
case.
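
The root-commit failure described in the commit message can be reproduced
outside of rebase. The following is a minimal Python sketch, assuming a
git binary on PATH; the helper names git and demo_root_amend and the
throwaway identity are made up for this illustration and are not part of
the patch.

```python
import os
import shutil
import subprocess
import tempfile


def git(repo, *args):
    """Run git in `repo` with a throwaway identity (hypothetical helper)."""
    cmd = ["git", "-c", "user.name=Example",
           "-c", "user.email=example@example.invalid", *args]
    return subprocess.run(cmd, cwd=repo, capture_output=True, text=True)


def demo_root_amend():
    """Return (reset_failed, amend_succeeded, commit_count).

    Returns None when no git binary is available.
    """
    if shutil.which("git") is None:
        return None
    with tempfile.TemporaryDirectory() as repo:
        path = os.path.join(repo, "file")
        git(repo, "init", "-q")
        with open(path, "w") as fh:
            fh.write("one\n")
        git(repo, "add", "file")
        git(repo, "commit", "-q", "-m", "root")

        # Old approach: rewind to the parent first. A root commit has
        # no parent, so HEAD^ cannot be resolved and the reset fails.
        reset = git(repo, "reset", "--soft", "HEAD^")

        # New approach: stage the fixup and amend the commit in place.
        with open(path, "a") as fh:
            fh.write("two\n")
        git(repo, "add", "file")
        amend = git(repo, "commit", "-q", "--amend", "--no-verify",
                    "-m", "root with fixup")
        count = git(repo, "rev-list", "--count", "HEAD").stdout.strip()
        return reset.returncode != 0, amend.returncode == 0, count


result = demo_root_amend()
```

The amend leaves a single commit in the history, which is the outcome a
fixup of the root commit is expected to produce.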

  git-rebase--interactive.sh| 25 +
  t/t3404-rebase-interactive.sh |  8 
  2 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
index bef7bc0..0d2056f 100644
--- a/git-rebase--interactive.sh
+++ b/git-rebase--interactive.sh
@@ -493,25 +493,28 @@ do_next () {
author_script_content=$(get_author_ident_from_commit HEAD)
echo $author_script_content  $author_script
eval $author_script_content
-   output git reset --soft HEAD^
-   pick_one -n $sha1 || die_failed_squash $sha1 $rest
+   if ! pick_one -n $sha1
+   then
+   git rev-parse --verify HEAD$amend
+   die_failed_squash $sha1 $rest
+   fi
case $(peek_next_command) in
squash|s|fixup|f)
# This is an intermediate commit; its message will only 
be
# used in case of trouble.  So use the long version:
-   do_with_author output git commit --no-verify -F 
$squash_msg ||
+   do_with_author output git commit --amend --no-verify -F 
$squash_msg ||
die_failed_squash $sha1 $rest


This new sequence looks *VERY* suspicious. It makes a HUGE difference in 
what is left behind if the cherry-pick fails. Did you think about what 
happens when the cherry-pick fails in a squash+squash+fixup+fixup sequence 
(or any combination thereof) and then the rebase is continued (after a 
manual resolution)?



;;
*)
# This is the final command of this squash/fixup group
if test -f $fixup_msg
then
-   do_with_author git commit --no-verify -F 
$fixup_msg ||
+   do_with_author git commit --amend --no-verify -F 
$fixup_msg ||
die_failed_squash $sha1 $rest
else
cp $squash_msg $GIT_DIR/SQUASH_MSG || exit
rm -f $GIT_DIR/MERGE_MSG
-   do_with_author git commit --no-verify -e ||
+   do_with_author git commit --amend --no-verify -F 
$GIT_DIR/SQUASH_MSG -e ||
die_failed_squash $sha1 $rest
fi
rm -f $squash_msg $fixup_msg
@@ -748,7 +751,6 @@ In both case, once you're done, continue with:
fi
. $author_script ||
die Error trying to find the author identity to amend 
commit
-   current_head=
if test -f $amend
then
current_head=$(git rev-parse --verify HEAD)
@@ -756,13 +758,12 @@ In both case, once you're done, continue with:
die \
  You have uncommitted changes in your working tree. Please, commit them
  first and then run 'git rebase --continue' again.
-   git reset --soft HEAD^ ||
-   die Cannot rewind the HEAD
+   do_with_author git commit --amend --no-verify -F $msg 
-e ||
+   die Could not commit staged changes.
+   else
+   do_with_author git commit --no-verify -F $msg -e ||
+   die Could not commit staged changes.
fi
-   do_with_author git commit --no-verify -F $msg -e || {
-   test -n $current_head  git reset --soft 
$current_head
-   die Could not commit staged changes.
-   }
fi

record_in_rewritten $(cat $state_dir/stopped-sha)
diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
index 8078db6..3f75d32 100755
--- a/t/t3404-rebase-interactive.sh
+++ b/t/t3404-rebase-interactive.sh
@@ -903,4 +903,12 @@ test_expect_success 'rebase -i --root temporary sentinel 
commit' '
git rebase --abort
  '

+test_expect_success 

Re: GIT smart http vs GIT over ssh

2012-07-31 Thread Michael J Gruber
vishwajeet singh venit, vidit, dixit 31.07.2012 11:04:
 On Tue, Jul 31, 2012 at 2:17 PM, Michael J Gruber
 g...@drmicha.warpmail.net wrote:
 vishwajeet singh venit, vidit, dixit 31.07.2012 05:19:
 On Tue, Jul 31, 2012 at 8:40 AM, Konstantin Khomoutov
 kostix+...@007spb.ru wrote:
 On Tue, Jul 31, 2012 at 08:36:07AM +0530, vishwajeet singh wrote:

 Just wanted to know the difference between smart http and ssh and in
 what scenarios we need them
 I am setting up a git server,  can I just do with smart http support
 or I need to enable the ssh support to use git effectively.
 As I understand github provides both the protocols, what's the reason
 for supporting both protocols.
 http://git-scm.com/book/en/Git-on-the-Server-The-Protocols
 http://git-scm.com/2010/03/04/smart-http.html


 Thanks for the links, I have already gone through those links, was
 looking from implementation perspective do I really need to support
 both protocols on my server or I can just do with smart http and
 what's the preferred way of doing it smart http or ssh



 You need to provide what your users demand ;)

 Seriously, this is why GitHub and other providers offer both. Not only
 are some users more comfortable with one protocol or the other (Win
 users don't prefer ssh generally) but some might be able to use only one
 because of firewalls or corporate rules.

 From the server perspective, the setup is completely different, of
 course. Do you have shell accounts already which you want to reuse for
 ssh+git? Do you prefer setting up a special purpose shell account
 (gitosis/gitolite) or setting up a web server with authentication?

 I already have server setup with smart http backend, was just
 wondering if my users would really need ssh support or not and I agree
 to your point it should be based on user demand.
 
 While going through the git book I encountered a very tall claim about
 smart http
  I think this will become the standard Git protocol in the very near
 future. I believe this because it's both efficient and can be run
 either secure and authenticated (https) or open and unauthenticated
 (http). It also has the huge advantage that most firewalls have those
 ports (80 and 443) open already and normal users don't have to deal
 with ssh-keygen and the like. Once most clients have updated to at
 least v1.6.6, http will have a big place in the Git world.
 
 http://git-scm.com/2010/03/04/smart-http.html
 
 Just based on above comment in book I thought if smart http is way to
 go for future why to take hassle of setting up ssh.

There is no need to set up ssh if smart http does the job for you. I
don't think it makes a difference performance-wise on the server
(upload-pack vs. http-backend) but others are more proficient in this area.

I'm sure ssh+git is here to stay; it is just the ordinary anonymous git
protocol tunneled through ssh. So, it's as future-proof as git is.

 I was planning to use gitosis as I have python background and code
 looks not being maintained from quite sometime, which worries me a
 bit, I stumbled upon gitomatic
 https://github.com/jorgeecardona/gitomatic, has anyone any prior
 experience

No idea about gitomatic. It looks a bit like gitolite in python
(alpha) but doesn't say much about its ancestry.

There's also gitolite which is actively maintained and used. Basically,
it's gitosis in perl. Sitaram, forgive me ;)

Michael


Re: GIT smart http vs GIT over ssh

2012-07-31 Thread Sitaram Chamarty
On Tue, Jul 31, 2012 at 2:50 PM, Michael J Gruber
g...@drmicha.warpmail.net wrote:
 vishwajeet singh venit, vidit, dixit 31.07.2012 11:04:
 On Tue, Jul 31, 2012 at 2:17 PM, Michael J Gruber
 g...@drmicha.warpmail.net wrote:
 vishwajeet singh venit, vidit, dixit 31.07.2012 05:19:
 On Tue, Jul 31, 2012 at 8:40 AM, Konstantin Khomoutov
 kostix+...@007spb.ru wrote:
 On Tue, Jul 31, 2012 at 08:36:07AM +0530, vishwajeet singh wrote:

 Just wanted to know the difference between smart http and ssh and in
 what scenarios we need them
 I am setting up a git server,  can I just do with smart http support
 or I need to enable the ssh support to use git effectively.
 As I understand github provides both the protocols, what's the reason
 for supporting both protocols.
 http://git-scm.com/book/en/Git-on-the-Server-The-Protocols
 http://git-scm.com/2010/03/04/smart-http.html


 Thanks for the links, I have already gone through those links, was
 looking from implementation perspective do I really need to support
 both protocols on my server or I can just do with smart http and
 what's the preferred way of doing it smart http or ssh



 You need to provide what your users demand ;)

 Seriously, this is why GitHub and other providers offer both. Not only
 are some users more comfortable with one protocol or the other (Win
 users don't prefer ssh generally) but some might be able to use only one
 because of firewalls or corporate rules.

 From the server perspective, the setup is completely different, of
 course. Do you have shell accounts already which you want to reuse for
 ssh+git? Do you prefer setting up a special purpose shell account
 (gitosis/gitolite) or setting up a web server with authentication?

 I already have server setup with smart http backend, was just
 wondering if my users would really need ssh support or not and I agree
 to your point it should be based on user demand.

 While going through the git book I encountered a very tall claim about
 smart http
  I think this will become the standard Git protocol in the very near
 future. I believe this because it's both efficient and can be run
 either secure and authenticated (https) or open and unauthenticated
 (http). It also has the huge advantage that most firewalls have those
 ports (80 and 443) open already and normal users don't have to deal
 with ssh-keygen and the like. Once most clients have updated to at
 least v1.6.6, http will have a big place in the Git world.

 http://git-scm.com/2010/03/04/smart-http.html

 Just based on above comment in book I thought if smart http is way to
 go for future why to take hassle of setting up ssh.

 There is no need to set up ssh if smart http does the job for you. I
 don't think it makes a difference performance-wise on the server
 (upload-pack vs. http-backend) but others are more proficient in this area.

 I'm sure ssh+git is there to stay, it is just ordinary anonymous git
 protocol tunneled through ssh. So, it's as future-proof as git is.

 I was planning to use gitosis as I have python background and code
 looks not being maintained from quite sometime, which worries me a
 bit, I stumbled upon gitomatic
 https://github.com/jorgeecardona/gitomatic, has anyone any prior
 experience

 No idea about gitomatic. It looks a bit like gitolite in python
 (alpha) but doesn't say much about it's ancestry.

 There's also gitolite which is actively maintained and used. Basically,
 it's gitosis in perl. Sitaram, forgive me ;)

oh that's quite alright.  People forget that gitolite was, for the
first 3 days of its life, called gitosis-lite :)

-- 
Sitaram


Re: Fix git-svn for SVN 1.7

2012-07-31 Thread Michael G Schwern
It just doesn't matter.

Why are we arguing over which solution will be 4% better two years from now,
or if my commits are formatted perfectly, when there are tremendous amounts of
basic work to be done improving git-svn?  The code is undocumented, lacking unit
tests, difficult to understand and riddled with bugs.

Either solution would be a vast improvement.  The most important thing is that
one of them actually gets done.  If both solutions offer a huge improvement,
do it the way the person actually writing the code wants to do it.  It'll be
more enjoyable for them, they'll be more likely to complete the work, and more
likely to stick around and code some more.


Re: [PATCH] rebase -i: handle fixup of root commit correctly

2012-07-31 Thread Chris Webb
Johannes Sixt j...@kdbg.org writes:

 Am 24.07.2012 14:17, schrieb Chris Webb:
 diff --git a/git-rebase--interactive.sh b/git-rebase--interactive.sh
 index bef7bc0..0d2056f 100644
 --- a/git-rebase--interactive.sh
 +++ b/git-rebase--interactive.sh
 @@ -493,25 +493,28 @@ do_next () {
  author_script_content=$(get_author_ident_from_commit HEAD)
  echo $author_script_content $author_script
  eval $author_script_content
 -output git reset --soft HEAD^
 -pick_one -n $sha1 || die_failed_squash $sha1 $rest
 +if ! pick_one -n $sha1
 +then
 +git rev-parse --verify HEAD $amend
 +die_failed_squash $sha1 $rest
 +fi
  case $(peek_next_command) in
  squash|s|fixup|f)
  # This is an intermediate commit; its message will only 
  be
  # used in case of trouble.  So use the long version:
 -do_with_author output git commit --no-verify -F 
 $squash_msg ||
 +do_with_author output git commit --amend --no-verify -F 
 $squash_msg ||
  die_failed_squash $sha1 $rest
 
 This new sequence looks *VERY* suspicious. It makes a HUGE
 difference in what is left behind if the cherry-pick fails. Did you
 think about what happens when the cherry-pick fails in a
 squash+squash+fixup+fixup sequence (or any combination thereof) and
 then the rebase is continued (after a manual resolution)?

I had to deal with the case where there's a conflict while picking the
squash/fixup, and we have to ensure we commit --amend in rebase --continue.
This is why I've written

  git rev-parse --verify HEAD >"$amend"

in the above, to use the pre-existing support for amending the HEAD commit
in rebase --continue. (We test for this fixup-conflict case in various ways
in t3404 and not doing an amend there would result in double commits and
spectacular test breakage.)

Is this the issue you mean here, or is it something more subtle which I'm
not properly following?

If we have a conflict in the middle of a chain of fixup/squashes, as far as
I can see, we have a HEAD with all the previous successful fixups applied,
conflict markers for the current failed pick, and when the conflict has been
resolved, git rebase --continue will commit --amend the resolution and
continue? Isn't that the correct behaviour here?

Cheers,

Chris.


Re: GIT smart http vs GIT over ssh

2012-07-31 Thread vishwajeet singh
On Tue, Jul 31, 2012 at 2:50 PM, Michael J Gruber
g...@drmicha.warpmail.net wrote:
 vishwajeet singh venit, vidit, dixit 31.07.2012 11:04:
 On Tue, Jul 31, 2012 at 2:17 PM, Michael J Gruber
 g...@drmicha.warpmail.net wrote:
 vishwajeet singh venit, vidit, dixit 31.07.2012 05:19:
 On Tue, Jul 31, 2012 at 8:40 AM, Konstantin Khomoutov
 kostix+...@007spb.ru wrote:
 On Tue, Jul 31, 2012 at 08:36:07AM +0530, vishwajeet singh wrote:

 Just wanted to know the difference between smart http and ssh and in
 what scenarios we need them
 I am setting up a git server,  can I just do with smart http support
 or I need to enable the ssh support to use git effectively.
 As I understand github provides both the protocols, what's the reason
 for supporting both protocols.
 http://git-scm.com/book/en/Git-on-the-Server-The-Protocols
 http://git-scm.com/2010/03/04/smart-http.html


 Thanks for the links, I have already gone through those links, was
 looking from implementation perspective do I really need to support
 both protocols on my server or I can just do with smart http and
 what's the preferred way of doing it smart http or ssh



 You need to provide what your users demand ;)

 Seriously, this is why GitHub and other providers offer both. Not only
 are some users more comfortable with one protocol or the other (Win
 users don't prefer ssh generally) but some might be able to use only one
 because of firewalls or corporate rules.

 From the server perspective, the setup is completely different, of
 course. Do you have shell accounts already which you want to reuse for
 ssh+git? Do you prefer setting up a special purpose shell account
 (gitosis/gitolite) or setting up a web server with authentication?

 I already have server setup with smart http backend, was just
 wondering if my users would really need ssh support or not and I agree
 to your point it should be based on user demand.

 While going through the git book I encountered a very tall claim about
 smart http
  I think this will become the standard Git protocol in the very near
 future. I believe this because it's both efficient and can be run
 either secure and authenticated (https) or open and unauthenticated
 (http). It also has the huge advantage that most firewalls have those
 ports (80 and 443) open already and normal users don't have to deal
 with ssh-keygen and the like. Once most clients have updated to at
 least v1.6.6, http will have a big place in the Git world.

 http://git-scm.com/2010/03/04/smart-http.html

 Just based on above comment in book I thought if smart http is way to
 go for future why to take hassle of setting up ssh.

 There is no need to set up ssh if smart http does the job for you. I
 don't think it makes a difference performance-wise on the server
 (upload-pack vs. http-backend) but others are more proficient in this area.

 I'm sure ssh+git is there to stay, it is just ordinary anonymous git
 protocol tunneled through ssh. So, it's as future-proof as git is.

Now I think I should add ssh support :-)

let me know anything I need to be careful about from security
perspective while implementing this and any general pointers, mean
while I will go through the docs as well.

 I was planning to use gitosis as I have python background and code
 looks not being maintained from quite sometime, which worries me a
 bit, I stumbled upon gitomatic
 https://github.com/jorgeecardona/gitomatic, has anyone any prior
 experience

 No idea about gitomatic. It looks a bit like gitolite in python
 (alpha) but doesn't say much about its ancestry.

 There's also gitolite which is actively maintained and used. Basically,
 it's gitosis in perl. Sitaram, forgive me ;)

I am looking for something based on python, I gave up using perl long
time back and there's no going back.
Will explore gitosis code and see if it can be used

Thanks for your answers it really helped me.

 Michael



-- 
Vishwajeet Singh


Re: [PATCH] rebase -i: handle fixup of root commit correctly

2012-07-31 Thread Chris Webb
Chris Webb ch...@arachsys.com writes:

 If we have a conflict in the middle of a chain of fixup/squashes, as far as
 I can see, we have a HEAD with all the previous successful fixups applied,
 conflict markers for the current failed pick, and when the conflict has been
 resolved, git rebase --continue will commit --amend the resolution and
 continue? Isn't that the correct behaviour here?

As an explicit test, I've just tried a chain of four squashed commits, each
of which deliberately resulted in a conflict to manually resolve. For each
squash, I was left with conflict markers on top of what had already been
squashed in the expected way, and when I continued after resolving these,
the resolution was 'commit --amend'ed in the expected way, with the same
behaviour and resulting commit at the end of the rebase -i as I get with a
copy of git without this patch.

Cheers,

Chris.


Re: Return code in cmd (git describe)

2012-07-31 Thread Manuela Hutter
On Tue, 31 Jul 2012 14:27:15 +0200, Konstantin Khomoutov  
flatw...@users.sourceforge.net wrote:



On Tue, 31 Jul 2012 14:02:50 +0200
Manuela Hutter manue...@opera.com wrote:


we have some python scripts that are run from Visual Studio, and one
of them fails because of a wrong git return code when calling
'git describe --dirty'

[...]

Run from mingw, the return code is 0.
Run from cmd, the return code is 1.

Why?

Supposedly, when you run Git from cmd.exe, the git.cmd top-level script
(which wraps the git.exe binary) gets executed, and it's currently
affected by this bug [1] (fixed).  When you run Git from Git bash, the
git.exe binary runs directly and hence the bug is not exposed.

1. https://github.com/msysgit/msysgit/issues/43


Great, thanks for the info!
/Manuela


Re: Centralized git

2012-07-31 Thread jaseem abid
On Tue, Jul 31, 2012 at 5:33 PM, Javier Domingo javier...@gmail.com wrote:

 I am currently planning a 3D project, and I will be having large binary
 files. If I add a distributed VCS, the amount of disk space required will
 increase significantly.


You are going to transfer something that won't fit on your hard disk
up and down your network once in a while?

I assume disk to be cheaper than network.

--
Jaseem Abid
http://jaseemabid.github.com


[WIP PATCH] Manual rename correction

2012-07-31 Thread Nguyen Thai Ngoc Duy
Git's rename detection is good but still not perfect. There have been
a few times I wanted to correct git for better output but I
couldn't. This PoC WIP patch attempts to address that. It allows
breaking/rearranging any file pairs. We can do something crazy like this:

 attr.c => dir.c      | 1786 -
 dir.c => attr.c      | 1788 +-
 t/t1306-xdg-files.sh |   39 ++
 t/test-lib.sh        |    1 +
 4 files changed, 1828 insertions(+), 1786 deletions(-)

The above output is done with git diff --manual-rename=foo A B
and foo contains (probably not in the best format though)

-- 8 --
attr.c dir.c
dir.c attr.c
-- 8 --

The plan is to use git-notes to record rename corrections like above
so that git log --patch for example can make use of them. I'm not
sure what to do with merge commits yet (can we track renames in a
merge?). We can generate rename file from diff -CM, then users can
edit and save it.

If you want to diff between two arbitrary trees, you'll have to feed
rename corrections via command line as git-notes are for commit diff
only.

In some cases, manual rename may be cheaper than --find-copies-harder,
so this feature could help reduce cpu usage. Though that's not my main
aim.

Oh and I think rename detection in diff other than tree-tree does not
work. Maybe I tested it the wrong way?

Comments?

-- 8 --
diff --git a/diff.c b/diff.c
index 62cbe14..c8d55d2 100644
--- a/diff.c
+++ b/diff.c
@@ -3547,6 +3547,12 @@ int diff_opt_parse(struct diff_options *options, const char **av, int ac)
DIFF_OPT_SET(options, RENAME_EMPTY);
else if (!strcmp(arg, "--no-rename-empty"))
DIFF_OPT_CLR(options, RENAME_EMPTY);
+   else if (!prefixcmp(arg, "--manual-rename=")) {
+   int ret = strbuf_read_file(&options->renames, arg + 16, 0);
+   if (ret == -1)
+   die("unable to read %s", arg + 16);
+   DIFF_OPT_SET(options, MANUAL_RENAME);
+   }
else if (!strcmp(arg, "--relative"))
DIFF_OPT_SET(options, RELATIVE_NAME);
else if (!prefixcmp(arg, "--relative=")) {
@@ -4621,6 +4627,8 @@ void diffcore_std(struct diff_options *options)
if (options->skip_stat_unmatch)
diffcore_skip_stat_unmatch(options);
if (!options->found_follow) {
+   if (DIFF_OPT_TST(options, MANUAL_RENAME))
+   diffcore_manual_rename(options);
/* See try_to_follow_renames() in tree-diff.c */
if (options->break_opt != -1)
diffcore_break(options->break_opt);
diff --git a/diff.h b/diff.h
index e027650..60d104e 100644
--- a/diff.h
+++ b/diff.h
@@ -61,6 +61,7 @@ typedef struct strbuf *(*diff_prefix_fn_t)(struct diff_options *opt, void *data)
 #define DIFF_OPT_FIND_COPIES_HARDER  (1 <<  6)
 #define DIFF_OPT_FOLLOW_RENAMES      (1 <<  7)
 #define DIFF_OPT_RENAME_EMPTY        (1 <<  8)
+#define DIFF_OPT_MANUAL_RENAME       (1 <<  9)
 /* (1 <<  9) unused */
 #define DIFF_OPT_HAS_CHANGES         (1 << 10)
 #define DIFF_OPT_QUICK               (1 << 11)
@@ -147,6 +148,7 @@ struct diff_options {
int close_file;
 
struct pathspec pathspec;
+   struct strbuf renames;
change_fn_t change;
add_remove_fn_t add_remove;
diff_format_fn_t format_callback;
diff --git a/diffcore-rename.c b/diffcore-rename.c
index 216a7a4..05da99f 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -722,3 +722,148 @@ void diffcore_rename(struct diff_options *options)
rename_src_nr = rename_src_alloc = 0;
return;
 }
+
+struct rename {
+   char *one, *two;
+   struct rename *next_one, *next_two;
+   struct diff_filespec *spec_one;
+   struct diff_filespec *spec_two;
+};
+
+static unsigned int string_hash(const char *s)
+{
+   unsigned int v = 1;
+   while (s && *s)
+   v += (unsigned char)*s++;
+   return v;
+}
+
+void diffcore_manual_rename(struct diff_options *options)
+{
+   struct rename *renames = NULL;
+   int i, nr = 0, alloc = 0;
+   const char *next, *p, *end;
+   struct hash_table hash_one, hash_two;
+   struct diff_queue_struct *q = diff_queued_diff;
+   struct diff_queue_struct outq;
+
+   /* parse rename instructions */
+   end = options->renames.buf + options->renames.len;
+   for (p = options->renames.buf; p < end; p = next) {
+   struct rename *r;
+   const char *sep, *nl, *next_sep;
+
+   nl = strchr(p, '\n');
+   if (!nl)
+   nl = next = end;
+   else {
+   next = nl + 1;
+   if (p == nl)
+   continue;
+   }
+
+   /* one space to separate two paths (for now, quoting can come later) */
+   sep = strchr(p, ' ');
+   if (!sep || sep >= nl)
+ 

Re: Centralized git

2012-07-31 Thread Ævar Arnfjörð Bjarmason
On Tue, Jul 31, 2012 at 3:08 PM, Javier Domingo javier...@gmail.com wrote:
 Network, in this case, is cheaper. The thing is that if I commit
 frequently, I will have plenty of GBs of history that I almost surely
 won't use. I just need to have other people's work to merge. But I
 want to think in Git style, I am pretty accustomed to that way of
 doing things. That is why I sent this mail here.

 The idea is that if I modify 700MBs of video, with 20 commits I would
 get in 21GB. And making a pull would be... just even more horrible
 than anything. That is why I need to have also last checkouts filter.
 Just download branch's HEADs.

You're obviously aware of git-annex, is there any reason you can't
just use that?

That would give you what you want, you'd have a moving window of
current files, and then you'd delete old files as they become
un-needed.


Re: Centralized git

2012-07-31 Thread Edward Toroshchin
Javier,

Are you sure you need git for those big binary files at all?

Branching makes sense only if merging makes sense, and I can hardly see
how you can merge three 700-megabyte video files.

-- 
Edward Hades Toroshchin
dr_lepper on irc.freenode.org


Re: [WIP PATCH] Manual rename correction

2012-07-31 Thread Junio C Hamano
Nguyen Thai Ngoc Duy pclo...@gmail.com writes:

 The above output is done with git diff --manual-rename=foo A B
 and foo contains (probably not in the best format though)

 -- 8 --
 attr.c dir.c
 dir.c attr.c
 -- 8 --
 ...
 Comments?

It is a good direction to go in, I would think, to give users a way
to explicitly tell that "in comparison between these two trees, I
know path B in the postimage corresponds to path A in the preimage".

I however wonder why you did this as a separate function that only
does the explicitly marked ones.  Probably it was easier as a POC to
do it this way, and that is fine.

The real version should do this in the same diffcore_rename()
function, by excluding the paths that the user explicitly told you
about from the automatic matching logic, and instead matching
them up manually; then you can let the remainder of the paths be
paired by the existing code.

Notice how the non-nullness of rename_dst[i].pair is used as a cue
to skip the similarity computation in the expensive matrix part of
diffcore_rename()?  That comes from find_exact_renames() that is
called earlier in the function.  I would imagine that your logic
would fit _before_ we call find_exact_renames() as a call to a new
helper function (e.g. record_explicit_renames() perhaps).
Anything that reduces the cost in the matrix part should come
earlier, as that reduces the number of pairs we would need to try
matching up.

We might want to introduce a way to express the similarity score for
such a filepair that was manually constructed when showing the
result, though.


[PATCH/RFC] grep: add a perlRegexp configuration option

2012-07-31 Thread J Smith
Enables the -P flag for perl regexps by default. When both the
perlRegexp and extendedRegexp options are enabled, the last enabled
option wins.
---
 Documentation/config.txt   |  6 ++
 Documentation/git-grep.txt |  6 ++
 builtin/grep.c | 17 +++--
 t/t7810-grep.sh| 34 ++
 4 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index a95e5a4..ff3019b 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1213,6 +1213,12 @@ grep.lineNumber::
 grep.extendedRegexp::
If set to true, enable '--extended-regexp' option by default.

+grep.perlRegexp::
+   If set to true, enable '--perl-regexp' option by default.
+
+When both the 'grep.extendedRegexp' and 'grep.perlRegexp' options
+are used, the last enabled option wins.
+
 gpg.program::
Use this custom program instead of gpg found on $PATH when
making or verifying a PGP signature. The program must support the
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 3bec036..8816968 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -45,6 +45,12 @@ grep.lineNumber::
 grep.extendedRegexp::
If set to true, enable '--extended-regexp' option by default.

+grep.perlRegexp::
+   If set to true, enable '--perl-regexp' option by default.
+
+When both the 'grep.extendedRegexp' and 'grep.perlRegexp' options
+are used, the last enabled option wins.
+

 OPTIONS
 ---
diff --git a/builtin/grep.c b/builtin/grep.c
index 29adb0a..b4475e6 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -268,11 +268,24 @@ static int grep_config(const char *var, const char *value, void *cb)
if (userdiff_config(var, value) < 0)
return -1;

+   if (!strcmp(var, "grep.perlregexp")) {
+   if (git_config_bool(var, value)) {
+   opt->fixed = 0;
+   opt->pcre = 1;
+   } else {
+   opt->pcre = 0;
+   }
+   return 0;
+   }
+
if (!strcmp(var, "grep.extendedregexp")) {
-   if (git_config_bool(var, value))
+   if (git_config_bool(var, value)) {
opt->regflags |= REG_EXTENDED;
-   else
+   opt->pcre = 0;
+   opt->fixed = 0;
+   } else {
opt->regflags &= ~REG_EXTENDED;
+   }
return 0;
}

diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index 24e9b19..5479dc9 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -729,6 +729,40 @@ test_expect_success LIBPCRE 'grep -P pattern' '
test_cmp expected actual
 '

+test_expect_success LIBPCRE 'grep pattern with grep.perlRegexp=true' '
+   git \
+   -c grep.perlregexp=true \
+   grep "\p{Ps}.*?\p{Pe}" hello.c >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success LIBPCRE 'grep pattern with grep.perlRegexp=true and then grep.extendedRegexp=true' '
+   test_must_fail git \
+   -c grep.perlregexp=true \
+   -c grep.extendedregexp=true \
+   grep "\p{Ps}.*?\p{Pe}" hello.c
+'
+
+test_expect_success LIBPCRE 'grep pattern with grep.extendedRegexp=true and then grep.perlRegexp=true' '
+   git \
+   -c grep.extendedregexp=true \
+   -c grep.perlregexp=true \
+   grep "\p{Ps}.*?\p{Pe}" hello.c >actual &&
+   test_cmp expected actual
+'
+
+test_expect_success LIBPCRE 'grep -E pattern with grep.perlRegexp=true' '
+   test_must_fail git \
+   -c grep.perlregexp=true \
+   grep -E "\p{Ps}.*?\p{Pe}" hello.c
+'
+
+test_expect_success LIBPCRE 'grep -G pattern with grep.perlRegexp=true' '
+   test_must_fail git \
+   -c grep.perlregexp=true \
+   grep -G "\p{Ps}.*?\p{Pe}" hello.c
+'
+
 test_expect_success 'grep pattern with grep.extendedRegexp=true' '
>empty &&
test_must_fail git -c grep.extendedregexp=true \
--
1.7.11.3



I can translate git-scm.com for Russian

2012-07-31 Thread Николай Бровко
Hi! I'm interested in translating git-scm.com into Russian and can do that
if your site has any features for multilingual support.

You can reply to this mail.
Best regards,
Nick Brovko





Re: [PATCH 0/2] test results for v1.7.12-rc0 on cygwin

2012-07-31 Thread Ramsay Jones
René Scharfe wrote:
 Am 28.07.2012 20:46, schrieb Ramsay Jones:
 Unfortunately, I was unable to reproduce the final failure in t7810-grep.sh.
 I tried, among other things, to provoke a failure thus:

  $ for i in $(seq 100); do
   if ! ./t7810-grep.sh -i -v; then
   break;
   fi
   done
  $

 but, apart from chewing on the cpu for about 50 minutes, it didn't result
 in a failure. :(

 However, after looking at test 59, it seems to me to be a stale (redundant)
 test. So, patch #2 removes that test! :-D [I wish I could reproduce the
 failure because I don't like not knowing why it failed, but ...]
 
 Removing the test makes sense, since it was needed for --ext-grep only, 
 is relatively expensive and a bit fragile (by depending on MAXARGS).

Indeed.

 I'm slightly worried about the non-reproducible failure, though.

Yep, me too.

 Perhaps a timing issue is involved and chances are higher if you leave 
 out the option -v?

Yes, one of the "among other things" I tried was to drop the -v, but the
end result was the same. Also, since I have DEFAULT_TEST_TARGET=prove
in my config.mak, I tried:

$ for i in $(seq 100); do
 if ! prove --exec sh t7810-grep.sh; then
 break;
 fi
 done
$

But again, it didn't provoke a failure (it did run quite a bit faster ...).

I have now run this test file in excess of 600 times without failure in the
last two evenings (taking about 5-6 hours wallclock time).
[I haven't come remotely close to running the test-suite 600 times on
cygwin in the last 6 years ...]

So, I'm out of ideas (and will stop trying to reproduce the failure).

ATB,
Ramsay Jones




[PATCH] macos: lazily initialize iconv

2012-07-31 Thread Junio C Hamano
In practice, the majority of paths do not have any utf8 character
that needs the canonicalization.  Lazily call iconv_open() and
iconv_close() to avoid unnecessary overhead.

Signed-off-by: Junio C Hamano gits...@pobox.com
---

 * This is not even compile tested, so it needs testing and
   benchmarking, as I do not even know how costly the calls to
   open/close are when we do not have to call iconv() itself.

   This was brought up by Linus (Cc'ed) in http://goo.gl/INWVc

 compat/precompose_utf8.c | 24 ++--
 compat/precompose_utf8.h |  1 +
 2 files changed, 19 insertions(+), 6 deletions(-)

diff --git a/compat/precompose_utf8.c b/compat/precompose_utf8.c
index d40d1b3..63ce89f 100644
--- a/compat/precompose_utf8.c
+++ b/compat/precompose_utf8.c
@@ -67,7 +67,7 @@ void probe_utf8_pathname_composition(char *path, int len)
 
 void precompose_argv(int argc, const char **argv)
 {
-   int i = 0;
+   int i;
const char *oldarg;
char *newarg;
iconv_t ic_precompose;
@@ -75,11 +75,19 @@ void precompose_argv(int argc, const char **argv)
if (precomposed_unicode != 1)
return;
 
+   /* Avoid iconv_open()/iconv_close() if there is nothing to convert */
+   for (i = 0; i < argc; i++) {
+   if (has_utf8(argv[i], (size_t)-1, NULL))
+   break;
+   }
+   if (i < argc)
+   return;
+
ic_precompose = iconv_open(repo_encoding, path_encoding);
if (ic_precompose == (iconv_t) -1)
return;
 
-   while (i < argc) {
+   for (i = 0; i < argc; i++) {
size_t namelen;
oldarg = argv[i];
if (has_utf8(oldarg, (size_t)-1, &namelen)) {
@@ -87,7 +95,6 @@ void precompose_argv(int argc, const char **argv)
if (newarg)
argv[i] = newarg;
}
-   i++;
}
iconv_close(ic_precompose);
 }
@@ -106,8 +113,7 @@ PREC_DIR *precompose_utf8_opendir(const char *dirname)
return NULL;
} else {
int ret_errno = errno;
-   prec_dir->ic_precompose = iconv_open(repo_encoding, path_encoding);
-   /* if iconv_open() fails, die() in readdir() if needed */
+   prec_dir->iconv_opened = 0;
errno = ret_errno;
}
 
@@ -136,6 +142,11 @@ struct dirent_prec_psx *precompose_utf8_readdir(PREC_DIR *prec_dir)
prec_dir->dirent_nfc->d_type = res->d_type;
 
if ((precomposed_unicode == 1) && has_utf8(res->d_name, (size_t)-1, NULL)) {
+   if (!prec_dir->iconv_opened) {
+   prec_dir->ic_precompose =
+   iconv_open(repo_encoding, path_encoding);
+   prec_dir->iconv_opened = 1;
+   }
if (prec_dir->ic_precompose == (iconv_t)-1) {
die("iconv_open(%s,%s) failed, but needed:\n"
"    precomposed unicode is not supported.\n"
@@ -181,7 +192,8 @@ int precompose_utf8_closedir(PREC_DIR *prec_dir)
int ret_errno;
ret_value = closedir(prec_dir->dirp);
ret_errno = errno;
-   if (prec_dir->ic_precompose != (iconv_t)-1)
+   if (prec_dir->iconv_opened &&
+   (prec_dir->ic_precompose != (iconv_t)-1))
iconv_close(prec_dir->ic_precompose);
free(prec_dir->dirent_nfc);
free(prec_dir);
diff --git a/compat/precompose_utf8.h b/compat/precompose_utf8.h
index 3b73585..8de485e 100644
--- a/compat/precompose_utf8.h
+++ b/compat/precompose_utf8.h
@@ -22,6 +22,7 @@ typedef struct dirent_prec_psx {
 
 typedef struct {
iconv_t ic_precompose;
+   int iconv_opened;
DIR *dirp;
struct dirent_prec_psx *dirent_nfc;
 } PREC_DIR;
-- 
1.7.12.rc1.43.g3fa3e7e



Re: [PATCH/RFC] grep: add a perlRegexp configuration option

2012-07-31 Thread Junio C Hamano
J Smith dark.pa...@gmail.com writes:

 Enables the -P flag for perl regexps by default. When both the
 perlRegexp and extendedRegexp options are enabled, the last enabled
 option wins.

Turning grep.extendedregexp from boolean to an extended boolean to
allow "grep.extendedregexp = perl" might be a better alternative.
That way, the user wouldn't have to worry about 7 variants of
grep.fooRegexp variables twenty years down the road, even though the
set of possible values given to grep.extendedregexp may have grown
over time by then.

 [...]


Re: [PATCH] macos: lazily initialize iconv

2012-07-31 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 In practice, the majority of paths do not have any utf8 character
 that needs the canonicalization.  Lazily call iconv_open() and
 iconv_close() to avoid unnecessary overhead.

 Signed-off-by: Junio C Hamano gits...@pobox.com
 ---

  * This is not even compile tested, so it needs testing and
benchmarking, as I do not even know how costly the calls to
open/close are when we do not have to call iconv() itself.

This was brought up by Linus (Cc'ed) in http://goo.gl/INWVc

Even though I also think that per-DIR iconv may not be the optimal
way to organize this (I think iconv_t should be a per-thread thing
at most), it would be a more involved change that needs to be done
by somebody who actually works on Mac, so the patch I sent is kept
deliberately minimum.

Just FYI.


Re: [PATCH] macos: lazily initialize iconv

2012-07-31 Thread Ralf Thielow
On Tue, Jul 31, 2012 at 7:52 PM, Junio C Hamano gits...@pobox.com wrote:
 +   /* Avoid iconv_open()/iconv_close() if there is nothing to convert */
 +   for (i = 0; i < argc; i++) {
 +   if (has_utf8(argv[i], (size_t)-1, NULL))
 +   break;
 +   }
 +   if (i < argc)
 +   return;
 +

I'm not very familiar with this code but:

before: it reencodes everything which is utf-8 in argv
after: it reencodes *nothing* if one string in argv is not in utf-8

am i wrong?


Re: [PATCH] macos: lazily initialize iconv

2012-07-31 Thread Junio C Hamano
Ralf Thielow ralf.thie...@gmail.com writes:

 On Tue, Jul 31, 2012 at 7:52 PM, Junio C Hamano gits...@pobox.com wrote:
 +   /* Avoid iconv_open()/iconv_close() if there is nothing to convert */
 +   for (i = 0; i < argc; i++) {
 +   if (has_utf8(argv[i], (size_t)-1, NULL))
 +   break;
 +   }
 +   if (i < argc)
 +   return;
 +

 I'm not very familiar with this code but:

 before: it reencodes everything which is utf-8 in argv
 after: it reencodes *nothing* if one string in argv is not in utf-8

 am i wrong?

You are right.  It should avoid the whole iconv thing if we saw _no_
utf8, i.e. the last two should be:

if (argc <= i)
return;

Thanks.


post-receive hooks based on push content

2012-07-31 Thread Jessica Lucci
Hey guys,

I'm trying to set up a post-receive hook that redirects code based on
the content of the actual push. Specifically, I'm trying to set up
automatic deployment of web apps with the option of sending the code
to a build farm first. For example, if you push your code to a git
repo, there should be a post-receive hook in there that looks to see
if /www is empty or something. If /www is empty, the app has yet to be
built, so the hook should push the code to a build farm that can
compile the code into a WAR file (or whatever is appropriate). If /www
is already populated, we assume the code has already been compiled and
will then push the code directly onto the node.

So, first of all, is this even possible?
And if so, how would I go about writing this specific hook?

Thanks for your time!
Jessica


[PATCH v2] macos: lazily initialize iconv

2012-07-31 Thread Junio C Hamano
In practice, the majority of paths do not have utf8 that needs
the canonicalization. Lazily call iconv_open()/iconv_close() to
avoid unnecessary overhead.

Signed-off-by: Junio C Hamano gits...@pobox.com
Helped-by: Ralf Thielow ralf.thie...@gmail.com
Helped-by: Linus Torvalds torva...@linux-foundation.org
---

 * This is not even compile tested, so it needs testing and
   benchmarking, as I do not even know how costly the calls to
   open/close are when we do not have to call iconv() itself.

   This reroll corrects an obvious thinko pointed out by Ralf, and
   gets rid of an extra iconv_opened field added unnecessarily in
   the previous round.

   This was brought up by Linus in http://goo.gl/INWVc

 compat/precompose_utf8.c | 19 ++++++++++++++-----
 1 file changed, 14 insertions(+), 5 deletions(-)

diff --git a/compat/precompose_utf8.c b/compat/precompose_utf8.c
index d40d1b3..79b5528 100644
--- a/compat/precompose_utf8.c
+++ b/compat/precompose_utf8.c
@@ -67,7 +67,7 @@ void probe_utf8_pathname_composition(char *path, int len)
 
 void precompose_argv(int argc, const char **argv)
 {
-	int i = 0;
+	int i;
 	const char *oldarg;
 	char *newarg;
 	iconv_t ic_precompose;
@@ -75,11 +75,19 @@ void precompose_argv(int argc, const char **argv)
 	if (precomposed_unicode != 1)
 		return;
 
+	/* Avoid iconv_open()/iconv_close() if there is nothing to convert */
+	for (i = 0; i < argc; i++) {
+		if (has_utf8(argv[i], (size_t)-1, NULL))
+			break;
+	}
+	if (argc <= i)
+		return; /* no utf8 found */
+
 	ic_precompose = iconv_open(repo_encoding, path_encoding);
 	if (ic_precompose == (iconv_t) -1)
 		return;
 
-	while (i < argc) {
+	for (i = 0; i < argc; i++) {
 		size_t namelen;
 		oldarg = argv[i];
 		if (has_utf8(oldarg, (size_t)-1, &namelen)) {
@@ -87,7 +95,6 @@ void precompose_argv(int argc, const char **argv)
 			if (newarg)
 				argv[i] = newarg;
 		}
-		i++;
 	}
 	iconv_close(ic_precompose);
 }
@@ -106,8 +113,7 @@ PREC_DIR *precompose_utf8_opendir(const char *dirname)
 		return NULL;
 	} else {
 		int ret_errno = errno;
-		prec_dir->ic_precompose = iconv_open(repo_encoding, path_encoding);
-		/* if iconv_open() fails, die() in readdir() if needed */
+		prec_dir->ic_precompose = (iconv_t)-1;
 		errno = ret_errno;
 	}
 
@@ -136,6 +142,9 @@ struct dirent_prec_psx *precompose_utf8_readdir(PREC_DIR *prec_dir)
 		prec_dir->dirent_nfc->d_type = res->d_type;
 
 		if ((precomposed_unicode == 1) && has_utf8(res->d_name, (size_t)-1, NULL)) {
+			if (prec_dir->ic_precompose == (iconv_t)-1)
+				prec_dir->ic_precompose =
+					iconv_open(repo_encoding, path_encoding);
 			if (prec_dir->ic_precompose == (iconv_t)-1) {
 				die("iconv_open(%s,%s) failed, but needed:\n"
 					"precomposed unicode is not supported.\n"
-- 
1.7.12.rc1.43.g3fa3e7e
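
The lazy-open idea in the patch above can be looked at in isolation. What follows is only a rough sketch under stated assumptions, not the code in compat/precompose_utf8.c: `struct lazy_conv`, `lazy_conv_get()`, `lazy_conv_release()` and the `open_attempts` counter are names made up for illustration, the "needs conversion" test is assumed to be a plain high-bit check, and the encodings are supplied by the caller.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical wrapper: the descriptor stays unopened until first use. */
struct lazy_conv {
	iconv_t cd;         /* (iconv_t)-1 while not successfully opened */
	int open_attempts;  /* how many times iconv_open() was called */
};

#define LAZY_CONV_INIT { (iconv_t)-1, 0 }

/* Assumed check for this sketch: any byte with the high bit set. */
static int has_high_bit(const char *s)
{
	for (; *s; s++)
		if ((unsigned char)*s & 0x80)
			return 1;
	return 0;
}

/*
 * Return a descriptor usable for converting `name`, or (iconv_t)-1 if
 * the name needs no conversion or iconv_open() failed.
 */
static iconv_t lazy_conv_get(struct lazy_conv *lc, const char *name,
			     const char *to, const char *from)
{
	if (!has_high_bit(name))
		return (iconv_t)-1;  /* pure ASCII: never touch iconv */
	if (lc->cd == (iconv_t)-1) {
		lc->open_attempts++;
		lc->cd = iconv_open(to, from);
	}
	return lc->cd;
}

/* Close the descriptor if it was ever opened. */
static void lazy_conv_release(struct lazy_conv *lc)
{
	if (lc->cd != (iconv_t)-1) {
		iconv_close(lc->cd);
		lc->cd = (iconv_t)-1;
	}
}
```

With this shape, a run over names that are all ASCII never calls iconv_open() at all, which is the saving the patch is after; whether that saving is measurable still needs the benchmarking mentioned above.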



Re: [PATCH v2] macos: lazily initialize iconv

2012-07-31 Thread Ralf Thielow
On Tue, Jul 31, 2012 at 8:37 PM, Junio C Hamano gits...@pobox.com wrote:
 +   /* Avoid iconv_open()/iconv_close() if there is nothing to convert */
 +   for (i = 0; i < argc; i++) {
 +           if (has_utf8(argv[i], (size_t)-1, NULL))
 +                   break;
 +   }
 +   if (argc <= i)
 +           return; /* no utf8 found */

sorry, but argc can never be smaller than i, right?


Re: [PATCH v2] macos: lazily initialize iconv

2012-07-31 Thread Junio C Hamano
Ralf Thielow ralf.thie...@gmail.com writes:

 On Tue, Jul 31, 2012 at 8:37 PM, Junio C Hamano gits...@pobox.com wrote:
 +   /* Avoid iconv_open()/iconv_close() if there is nothing to convert */
 +   for (i = 0; i < argc; i++) {
 +           if (has_utf8(argv[i], (size_t)-1, NULL))
 +                   break;
 +   }
 +   if (argc <= i)
 +           return; /* no utf8 found */

 sorry, but argc can never be smaller than i, right?

Yeah, but it is idiomatic to have an inverse of the exit condition
of the preceding for loop here to catch an early exit, and writing
it as if (i == argc), while technically correct, would break the
pattern.
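
To spell out the idiom as a small standalone sketch: the test after the loop is written as the negation of the loop condition, so "i < argc" in the for statement pairs with "argc <= i" below it. The function names here (`needs_conversion()`, `first_to_convert()`) are invented for the example and the predicate is only a stand-in for has_utf8().

```c
#include <stddef.h>

/* Stand-in predicate for the sketch: true if any byte has the high bit set. */
static int needs_conversion(const char *s)
{
	for (; *s; s++)
		if ((unsigned char)*s & 0x80)
			return 1;
	return 0;
}

/*
 * Returns the index of the first argument that needs conversion,
 * or -1 when the scan ran to completion without finding one.
 */
static int first_to_convert(int argc, const char **argv)
{
	int i;

	for (i = 0; i < argc; i++) {
		if (needs_conversion(argv[i]))
			break;
	}
	if (argc <= i)      /* inverse of the loop condition "i < argc" */
		return -1;  /* the loop was not left early: nothing found */
	return i;
}
```

Since i never exceeds argc here, "i == argc" would behave the same; the point is only that the reader can match the two conditions at a glance.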


Re: post-receive hooks based on push content

2012-07-31 Thread Junio C Hamano
Jessica Lucci jessicalucc...@gmail.com writes:

 Hey guys,

 I'm trying to set up a post-receive hook that redirects code based on
 the content of the actual push. Specifically, I'm trying to set up
 automatic deployment of web apps with the option of sending the code
 to a build farm first. For example, if you push your code to a git
 repo, there should be a post-receive hook in there that looks to see
 if /www is empty or something. If /www is empty, the app has yet to be
 built, so the hook should push the code to a build farm that can
 compile the code into a WAR file (or whatever is appropriate). If /www
 is already populated, we assume the code has already been compiled and
 will then push the code directly onto the node.

 So, first of all, is this even possible?

Should be.

 And if so, how would I go about writing this specific hook?

By writing the necessary pieces and then stringing them together?

 - How do you see /www is empty?  That is one piece.  It is outside
   of the scope of git, but it perhaps involves looking at the
   output of ls /www or something.

 - How do you know if the app has yet to be built?  That is
   another piece and depends on how your build infrastructure is set
   up.  It is outside of the scope of git.

 - How do you push the code to a build farm?  You would be using
   git push $there $what, and presumably you know where $there is
   (the repository your build farm reads from).  $what is given as
   the input for post-receive hook, or you can read the tip of the
   ref you care about yourself, as your hook will run in the
   receiving repository of the push.

 - How do you push the code to a node?  That would be left as an
   exercise to the reader ;-), but would be similar to how you push
   to your build farm, I would imagine.
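
As a rough sketch of the "$what is given as the input" piece: a post-receive hook reads one line per updated ref on its standard input, in the form "<old-value> <new-value> <ref-name>". Hooks are usually shell scripts, but any executable works, so the parsing is shown in C here. `struct ref_update`, `parse_update_line()` and `is_master_update()` are names invented for this example, and reacting only to refs/heads/master is an assumed policy, not something git requires.

```c
#include <stdio.h>
#include <string.h>

struct ref_update {
	char old_id[65];  /* room for a hex object name plus NUL */
	char new_id[65];
	char ref[256];
};

/* Parse one "<old> <new> <refname>" line; returns 0 on success, -1 otherwise. */
static int parse_update_line(const char *line, struct ref_update *u)
{
	if (sscanf(line, "%64s %64s %255s", u->old_id, u->new_id, u->ref) != 3)
		return -1;
	return 0;
}

/* Hypothetical policy: only react to pushes that update master. */
static int is_master_update(const struct ref_update *u)
{
	return !strcmp(u->ref, "refs/heads/master");
}
```

A hook built on this would call fgets() on stdin in a loop, parse each line, and then decide between the build farm and the node using whatever test for /www fits your setup.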



Re: [WIP PATCH] Manual rename correction

2012-07-31 Thread Jeff King
On Tue, Jul 31, 2012 at 09:32:49AM -0700, Junio C Hamano wrote:

 Nguyen Thai Ngoc Duy pclo...@gmail.com writes:
 
  The above output is done with git diff --manual-rename=foo A B
  and foo contains (probably not in the best format though)
 
  -- 8 --
  attr.c dir.c
  dir.c attr.c
  -- 8 --
  ...
  Comments?
 
 It is a good direction to go in, I would think, to give users a way
 to explicitly tell that in comparison between these two trees, I
 know path B in the postimage corresponds to path A in the preimage.

I do not think that is the right direction. Let's imagine that I have a
commit A and I annotate it (via notes or whatever) to say "between
A^^{tree} and A^{tree}, foo.c became bar.c". That will help me when
doing "git show" or "git log". But it will not help me when I later try
to merge A (or its descendant). In that case, I will compute the diff
between A and the merge-base (or worse, some descendant of A and the
merge-base), and I will miss this hint entirely.

A much better hint is to annotate pairs of sha1s, to say "do not bother
doing inexact rename correlation on this pair; I promise that they have
value N". Then it will find that pair no matter which trees or commits
are being diffed, and it will do so relatively inexpensively[1].

That is not fool-proof, of course. You might have a manual rename from
sha1 X to sha1 Y, and then a slight modification to Y to make Z. So you
would want some kind of transitivity to notice that X and Z correlate.
I think you could model it as a graph problem; sha1s are nodes, and each
"this is a rename" pair of annotated sha1s has an edge between them.
They are the same file if there is a path.
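
To make the graph idea concrete, here is a minimal sketch using a union-find structure, which is one common way to answer "is there a path" for undirected edges. It is only an illustration: blobs are represented by small integers standing in for sha1s, and `identity_init()`, `identity_link()` and `identity_same()` are hypothetical names, not anything in git.

```c
#define MAX_BLOBS 16

/* parent[i] == i means node i is the representative of its set. */
static int parent[MAX_BLOBS];

static void identity_init(void)
{
	int i;

	for (i = 0; i < MAX_BLOBS; i++)
		parent[i] = i;
}

static int identity_find(int x)
{
	while (parent[x] != x) {
		parent[x] = parent[parent[x]];  /* path halving */
		x = parent[x];
	}
	return x;
}

/* Record a manual "x was renamed to y" annotation as an undirected edge. */
static void identity_link(int x, int y)
{
	parent[identity_find(x)] = identity_find(y);
}

/* Two blobs count as "the same file" if a chain of annotations connects them. */
static int identity_same(int x, int y)
{
	return identity_find(x) == identity_find(y);
}
```

Note that this treats every edge as equally strong, so it has exactly the unbounded transitivity that makes the results questionable once X and Z have drifted far apart.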

Of course that gives you bizarre and counter-intuitive results, because
X and Z might not actually be that similar. And that is why we have
rename detection in the first place. The idea of file identity (which
this fundamentally is) leads to these sorts of weird results.

I'm sure you could get better results by weakening the transitivity
according to the rename score, or something like that. But now you are
getting pretty complex.

-Peff

[1] We could actually cache rename results by storing pairs of sha1s
along with their rename score, and should be able to get a good
speedup (we are still src*dst in comparing, but now the comparison
is a simple table lookup rather than loading the blobs and computing
the differences). If we had such a cache, then manually marking a
rename would just be a matter of priming the cache with your manual
entries.
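
A minimal sketch of such a cache, assuming keys are pairs of hex sha1 strings and the table is a small fixed-size open-addressing hash kept in memory. All names (`score_cache`, `cache_put()`, `cache_get()`) and the table size are made up for illustration; a real version would need persistence and growth.

```c
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 64

struct score_entry {
	char src[41];  /* hex sha1 of the preimage blob */
	char dst[41];  /* hex sha1 of the postimage blob */
	int score;
	int used;
};

static struct score_entry score_cache[CACHE_SLOTS];

static unsigned slot_for(const char *src, const char *dst)
{
	unsigned h = 5381;

	for (; *src; src++)
		h = h * 33 + (unsigned char)*src;
	for (; *dst; dst++)
		h = h * 33 + (unsigned char)*dst;
	return h % CACHE_SLOTS;
}

/* Linear probing: find the entry for this pair, or the first free slot. */
static struct score_entry *probe(const char *src, const char *dst)
{
	unsigned i, start = slot_for(src, dst);

	for (i = 0; i < CACHE_SLOTS; i++) {
		struct score_entry *e = &score_cache[(start + i) % CACHE_SLOTS];
		if (!e->used || (!strcmp(e->src, src) && !strcmp(e->dst, dst)))
			return e;
	}
	return NULL;  /* table full */
}

/* Store a computed score, or prime the cache with a manual annotation. */
static int cache_put(const char *src, const char *dst, int score)
{
	struct score_entry *e = probe(src, dst);

	if (!e)
		return -1;
	snprintf(e->src, sizeof(e->src), "%s", src);
	snprintf(e->dst, sizeof(e->dst), "%s", dst);
	e->score = score;
	e->used = 1;
	return 0;
}

/* Returns the cached score, or -1 if the pair has not been seen. */
static int cache_get(const char *src, const char *dst)
{
	struct score_entry *e = probe(src, dst);

	return (e && e->used) ? e->score : -1;
}
```

The lookup is keyed on the ordered pair, so a manual entry for (X, Y) says nothing about (Y, X) unless that is stored as well.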


Re: [RFC v2 01/16] Implement a remote helper for svn in C.

2012-07-31 Thread Florian Achleitner
On Monday 30 July 2012 09:28:27 Junio C Hamano wrote:
 Florian Achleitner florian.achleitner.2.6...@gmail.com writes:
  Enables basic fetching from subversion repositories. When processing
  Remote URLs starting with svn::, git invokes this remote-helper.
  It starts svnrdump to extract revisions from the subversion repository in
  the 'dump file format', and converts them to a git-fast-import stream
  using the functions of vcs-svn/.
  
  Imported refs are created in a private namespace at
  refs/svn/remote-name/master. The revision history is imported linearly
  (no branch detection) and completely, i.e. from revision 0 to HEAD.
  
  Signed-off-by: Florian Achleitner florian.achleitner.2.6...@gmail.com
  ---
  
   contrib/svn-fe/remote-svn.c |  190 +++
   1 file changed, 190 insertions(+)
   create mode 100644 contrib/svn-fe/remote-svn.c
  
  diff --git a/contrib/svn-fe/remote-svn.c b/contrib/svn-fe/remote-svn.c
  new file mode 100644
  index 000..d5c2df8
  --- /dev/null
  +++ b/contrib/svn-fe/remote-svn.c
  @@ -0,0 +1,190 @@
  +
  +#include "cache.h"
  +#include "remote.h"
  +#include "strbuf.h"
  +#include "url.h"
  +#include "exec_cmd.h"
  +#include "run-command.h"
  +#include "svndump.h"
  +#include "argv-array.h"
  +
  +static const char *url;
  +static const char *private_ref;
  +static const char *remote_ref = "refs/heads/master";
  +
  +int cmd_capabilities(struct strbuf *line);
  +int cmd_import(struct strbuf *line);
  +int cmd_list(struct strbuf *line);
 
 How many of these and other symbols are necessary to be visible
 outside this file?

Will check and make them static.

 
  +typedef int (*input_command_handler)(struct strbuf *);
  +struct input_command_entry {
  +   const char *name;
  +   input_command_handler fct;
  +   unsigned char batchable;    /* whether the command starts or is part of a batch */
  +};
  +
  +static const struct input_command_entry input_command_list[] = {
  +   { "capabilities", cmd_capabilities, 0 },
  +   { "import", cmd_import, 1 },
  +   { "list", cmd_list, 0 },
  +   { NULL, NULL }
  +};
  +
  +int cmd_capabilities(struct strbuf *line)
  +{
  +   printf("import\n");
  +   printf("refspec %s:%s\n\n", remote_ref, private_ref);
  +   fflush(stdout);
  +   return 0;
  +}
  +
  +static void terminate_batch() {
  +   /* terminate a current batch's fast-import stream */
 
 Style:
 
   static void terminate_batch(void)
   {
   /* terminate ...
 

Ok. Opening braces in new lines, right? But inside functions it's ok to have 
them on the same line?

  +   printf("done\n");
  +   fflush(stdout);
  +}
  +
  +int cmd_import(struct strbuf *line)
  +{
  +   int code, report_fd;
  +   char *back_pipe_env;
  +   int dumpin_fd;
  +   unsigned int startrev = 0;
  +   struct argv_array svndump_argv = ARGV_ARRAY_INIT;
  +   struct child_process svndump_proc;
  +
  +   /*
  +    * When the remote-helper is invoked by transport-helper.c it passes the
  +    * filename of this pipe in the env-var.
  +    */
 
 s/ it passes/, /;
 
  +   back_pipe_env = getenv("GIT_REPORT_FIFO");
 
 Can we name "back pipe", "report fifo" and "report fd" more
 consistently and descriptively?
 
 What kind of REPORT are we talking about here?  Is it to carry the
 contents of

This topic (pipe vs. fifo) is still under discussion with Jonathan. I called it 
REPORT, because that was the name of it in vcs-svn. That will change.

 
  +   if (!back_pipe_env) {
  +           die("Cannot get cat-blob-pipe from environment! GIT_REPORT_FIFO has to "
  +               "be set by the caller.");
  +   }
 
 Style: unnecessary {} block around a simple statement.  It is OK to
 have such a block early in a series if you add more statements to it
 in later steps, but that does not seem to be the case for this patch
 series.

ack.

 
  +   /*
  +    * Opening a fifo for reading usually blocks until a writer has opened it too.
  +    * Opening a fifo for writing usually blocks until a reader has opened it too.
  +    * Therefore, we open with RDWR on both sides to avoid deadlocks.
  +    * Especially if there's nothing to do and one pipe end is closed immediately.
  +    */
 
 This smells somewhat fishy justification.  Are we reading what we
 wrote to the fifo?  Who is expected to come at the other end of the
 fifo?  Is it this process that creates that other process?  Perhaps
 you should open it _after_ spawning the process, telling it to open
 the same fifo for writing (if you are sitting on the reading end)?

Of course, it's a workaround. The fifo is from fast-import to the remote-
helper. I explained the deadlocks that can occur in a mail some days ago. 
Pasted:


I believe it can be solved using RDONLY and WRONLY too. Probably we solve it
by not using the fifo at all.
Currently the blocking comes from the fact that fast-import doesn't parse
its command line at startup. It rather reads an input line first and

Need help to complete the proposed gsoc 2012 project

2012-07-31 Thread jaseem abid
Dear list,

Project : Use JavaScript library / framework in gitweb
Project Description : https://github.com/peff/git/wiki/SoC-2012-Ideas
Code: https://github.com/jaseemabid/git/commits/gitweb

The project was proposed by Jakub Narębski for GSoC 2012 but was not taken
up because git didn't get enough slots from Google. I almost completed the work
and now I am stuck at a stage where I can't move forward without some help from
the community. Jakub and Andrew Sayers used to help me but they are busy with
their own work and are not available.

The JavaScript in gitweb was re-implementing a lot of features which many
common libraries have already implemented several times. For example, sending
an Ajax request or DOM manipulation required different code in different
browsers. The project aims to clean up the JavaScript using such a library. As
of now, I have ported all of the existing features to a cleaner version using
jQuery, and I am ready to work on more features if required.

I have also added tests for the JavaScript, which didn't exist before, using
the mocha BDD testing framework. The tests can be run both in a console with
node.js, producing TAP formatted output, and in browsers. I have not introduced
any UI changes. The old CSS was used without modifications. Code quality was
taken into account. All code is JSLint valid now.

Here is the detailed project status, todo, known bugs and commit notes :
https://gist.github.com/3218461

Any help will be greatly appreciated.

Regards,

Jaseem Abid
http://jaseemabid.github.com


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-31 Thread Florian Achleitner
On Monday 30 July 2012 11:55:02 Jonathan Nieder wrote:
 Florian Achleitner wrote:
  Hm .. that would mean, that both fast-import and git (transport-helper)
  would write to the remote-helper's stdin, right?
 
 Yes, first git writes the list of refs to import, and then fast-import
 writes feedback during the import.  Is that a problem?

I haven't tried that yet, nor do I remember anything where I've already seen
two processes writing to the same pipe.
At least it sounds cumbersome to me. Processes' lifetimes overlap, so buffering
and flushing could mix data.
We have to use it for both purposes interchangeably because there can be more
than one import command to the remote-helper, of course.

Will try that in test-program..


[PATCH/RFC] sane_execvp(): ignore non-directory on PATH

2012-07-31 Thread Junio C Hamano
When you have a non-directory on your PATH, a funny thing happens:

$ PATH=$PATH:/bin/sh git foo
fatal: cannot exec 'git-foo': Not a directory?

Worse yet, as real commands always take precedence over aliases,
this behaviour interacts rather badly with them:

$ PATH=$PATH:/bin/sh git -c alias.foo=show git foo -s
fatal: cannot exec 'git-foo': Not a directory?

This is because an ENOTDIR error from the underlying execvp(2) is
reported back to the caller of our sane_execvp() wrapper as-is.
Translating it to ENOENT, just like the case where we _might_ have
the command in an unreadable directory, fixes it.  Without an alias,
we would get

git: 'foo' is not a git command. See 'git --help'.

and we use the 'foo' alias when it is available.

Signed-off-by: Junio C Hamano gits...@pobox.com
---

 * We can view this as a follow-up to 38f865c (run-command: treat
   inaccessible directories as ENOENT, 2012-03-30).

 run-command.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/run-command.c b/run-command.c
index 805d41f..f9b7db2 100644
--- a/run-command.c
+++ b/run-command.c
@@ -77,6 +77,8 @@ int sane_execvp(const char *file, char * const argv[])
 	 */
 	if (errno == EACCES && !strchr(file, '/'))
 		errno = exists_in_PATH(file) ? EACCES : ENOENT;
+	else if (errno == ENOTDIR && !strchr(file, '/'))
+		errno = ENOENT;
 	return -1;
 }
 


Re: [PATCH v2] macos: lazily initialize iconv

2012-07-31 Thread Linus Torvalds
On Tue, Jul 31, 2012 at 11:37 AM, Junio C Hamano gits...@pobox.com wrote:

  * This is not even compile tested, so it needs testing and
benchmarking, as I do not even know how costly the calls to
open/close are when we do not have to call iconv() itself.

Ok, so it's easily compile-tested: just add

+   COMPAT_OBJS += compat/precompose_utf8.o
+   BASIC_CFLAGS += -DPRECOMPOSE_UNICODE

to the makefile for Linux too.

Actually testing how well it *works* is hard, since I don't really
have a mac (well, I do, but it no longer has OS X on it ;), and the
whole utf-8-mac thing does not make sense.

HOWEVER. I actually tested it with the conversion being from Latin1 to
UTF-8 instead, and it does interesting things, and kind of works. I
say kind of, because for the case of the filesystem being in Latin1,
we actually have to convert things back to the filesystem character
set when doing open() and lstat(), and this patch obviously
doesn't do that, because OS X does the conversion back to NFD on its
own.

But ACK on the patch.

If I had more time, I'd actually be interested to do the generic case
of namespace conversion, and we could make this a generic git feature
- it's something I wanted to do long ago. However, right now I'm in
the merge window and will go on a vacation to Finland after that, so I
probably won't get around to it.

I do have one suggestion: please rename the has_utf8() function to
has_nonascii() too. Why? Because that's what the function actually
does. It has nothing to do with testing for UTF-8 (the utf-8 rules are
more complex than just high bit set), and *if* I ever get around to
doing a more generic character set conversion for the filenames, the
decision really would be about non-ASCII, not about non-UTF8.
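
For illustration, a function with the suggested name might look like the sketch below. The signature is modelled on how has_utf8() is called in the patch (a string, a length limit, and an optional out-parameter for the length); the body is an assumption about what such a check does, not a copy of the existing function.

```c
#include <stddef.h>

/*
 * Returns 1 if any of the first `maxlen` bytes of `s` (stopping at the
 * NUL) has its high bit set, 0 otherwise.  If `len_out` is not NULL it
 * receives the number of bytes scanned.
 */
static int has_nonascii(const char *s, size_t maxlen, size_t *len_out)
{
	size_t n = 0;
	int found = 0;

	while (n < maxlen && s[n]) {
		if ((unsigned char)s[n] & 0x80)
			found = 1;
		n++;
	}
	if (len_out)
		*len_out = n;
	return found;
}
```

A lone 0xff byte makes this return 1 even though it can never appear in valid UTF-8, which is why the "nonascii" name describes the check more accurately.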

  Linus


Re: [PATCH] rebase -i: handle fixup of root commit correctly

2012-07-31 Thread Johannes Sixt

Am 31.07.2012 14:48, schrieb Chris Webb:

Chris Webb ch...@arachsys.com writes:


If we have a conflict in the middle of a chain of fixup/squashes, as far as
I can see, we have a HEAD with all the previous successful fixups applied,
conflict markers for the current failed pick, and when the conflict has been
resolved, git rebase --continue will commit --amend the resolution and
continue? Isn't that the correct behaviour here?


As an explicit test, I've just tried a chain of four squashed commits, each
of which deliberately resulted in a conflict to manually resolve. For each
squash, I was left with conflict markers on top of what had already been
squashed in the expected way, and when I continued after resolving these,
the resolution was 'commit --amend'ed in the expected way, with the same
behaviour and resulting commit at the end of the rebase -i as I get with a
copy of git without this patch.


OK, good. One subtlety to watch out for is when commit messages are 
edited. That is, if you edit the proposed message at 'rebase --continue' 
after the first squash failed, is the new text preserved until the last 
squash? I *think* that previously that was the case.


That said, I do appreciate the new modus operandi. The state when a rebase 
is interrupted is much clearer than earlier: now HEAD contains everything 
that was successfully replayed so far, and the index anything that failed.


-- Hannes


Re: [PATCH v2] macos: lazily initialize iconv

2012-07-31 Thread Junio C Hamano
Linus Torvalds torva...@linux-foundation.org writes:

 On Tue, Jul 31, 2012 at 11:37 AM, Junio C Hamano gits...@pobox.com wrote:

  * This is not even compile tested, so it needs testing and
benchmarking, as I do not even know how costly the calls to
open/close are when we do not have to call iconv() itself.

 Ok, so it's easily compile-tested: just add

 +   COMPAT_OBJS += compat/precompose_utf8.o
 +   BASIC_CFLAGS += -DPRECOMPOSE_UNICODE

 to the makefile for Linux too.

 Actually testing how well it *works* is hard, since I don't really
 have a mac (well, I do, but it no longer has OS X on it ;), and the
 whole utf-8-mac thing does not make sense.

Also the motivation for this change (not the original utf-8-mac one,
which is not my code) is about not paying unnecessary iconv_open()
overhead when we do not have to, so the measurement has to happen on
Mac, not on Linux.

 HOWEVER. I actually tested it with the conversion being from Latin1 to
 UTF-8 instead, and it does interesting things, and kind of works. I
 say kind of, because for the case of the filesystem being in Latin1,
 we actually have to convert things back to the filesystem character
 set ...

Eek.

Not just write_entry() codepath that creates the final paths on the
filesystem, you would need to touch lstat() calls that check the
existence and freshness of the path, once you go that route.  I am
sure such a change can be made to work, but I am not sure how much
we would gain from one.


Re: [WIP PATCH] Manual rename correction

2012-07-31 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 A much better hint is to annotate pairs of sha1s, to say do not bother
 doing inexact rename correlation on this pair; I promise that they have
 value N.

Surely.  And I suspect that the patch to the current codebase to do
so would be much less impact if you go that way.



Re: [PATCH/RFC] grep: add a perlRegexp configuration option

2012-07-31 Thread J Smith
On Tue, Jul 31, 2012 at 2:04 PM, Junio C Hamano gits...@pobox.com wrote:

 Turning grep.extendedregexp from boolean to an extended boolean to
 allow grep.extendedregexp = perl might be a better alternative.
 That way, the user wouldn't have to worry about 7 variants of
 grep.fooRegexp variables twenty years down the road, even though the
 set of possible values given to grep.extendedregexp may have grown
 over time by then.

Yeah, that sounds good. I've re-written the patch to accommodate the
change allowing for the current boolean settings of true/false as well
as perl. For the sake of completeness (verbosity? pedantry?) I also
included a setting for extended which is equivalent to true.

With this sort of change, would a more generic grep.regexpMode,
grep.patternType or something similar perhaps be more descriptive,
with grep.extendedRegexp being aliased for backwards compatibility
purposes? I could also add that functionality if desired.


Re: [PATCH/RFC] sane_execvp(): ignore non-directory on PATH

2012-07-31 Thread Jeff King
On Tue, Jul 31, 2012 at 12:46:13PM -0700, Junio C Hamano wrote:

 When you have a non-directory on your PATH, a funny thing happens:
 
   $ PATH=$PATH:/bin/sh git foo
   fatal: cannot exec 'git-foo': Not a directory?
 
 Worse yet, as real commands always take precedence over aliases,
 this behaviour interacts rather badly with them:
 
   $ PATH=$PATH:/bin/sh git -c alias.foo=show git foo -s
   fatal: cannot exec 'git-foo': Not a directory?
 
 This is because an ENOTDIR error from the underlying execvp(2) is
 reported back to the caller of our sane_execvp() wrapper as-is.
 Translating it to ENOENT, just like the case where we _might_ have
 the command in an unreadable directory, fixes it.  Without an alias,
 we would get
 
   git: 'foo' is not a git command. See 'git --help'.
 
 and we use the 'foo' alias when it is available.
 
 Signed-off-by: Junio C Hamano gits...@pobox.com
 ---
 
  * We can view this as a follow-up to 38f865c (run-command: treat
inaccessible directories as ENOENT, 2012-03-30).

Hrm. EACCES is somewhat special, in that the underlying execvp will
continue after seeing EACCES, and will only report it back to us if we
don't eventually find a good candidate.

Is ENOTDIR the same? IOW, If I do:

  PATH=/bin/cat:/bin
  ls

will I still run ls? Testing on my glibc system says yes, which I
think makes this a sane thing to do (if it were not the case and ENOTDIR
causes an early return, then that ENOENT is kind of a lie, since we
simply don't know the answer).
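
The behaviour described here, where the search carries on past a PATH entry that is not a directory, can be sketched as a hand-rolled lookup. This is not how any particular libc implements execvp(); `find_in_path()` is a hypothetical helper, it uses access(2) as the probe, and it skips empty PATH entries for simplicity (a real search treats them as the current directory).

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Look for an executable `file` in the colon-separated `path` list.
 * On success, writes the full name to `out` and returns 0.
 * Entries that fail for any reason, ENOTDIR included, are skipped,
 * so a non-directory early in the list does not stop the search.
 */
static int find_in_path(const char *path, const char *file,
			char *out, size_t outlen)
{
	const char *p = path;

	while (*p) {
		const char *end = strchr(p, ':');
		size_t len = end ? (size_t)(end - p) : strlen(p);

		if (len &&
		    snprintf(out, outlen, "%.*s/%s", (int)len, p, file) < (int)outlen &&
		    access(out, X_OK) == 0)
			return 0;
		p += len;
		if (*p == ':')
			p++;
	}
	errno = ENOENT;  /* nothing usable anywhere on the list */
	return -1;
}
```

With a list such as "/bin/cat:/bin", the first candidate fails with ENOTDIR and the second one is still tried, so reporting ENOENT only after the whole list is exhausted is not a lie.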

 diff --git a/run-command.c b/run-command.c
 index 805d41f..f9b7db2 100644
 --- a/run-command.c
 +++ b/run-command.c
 @@ -77,6 +77,8 @@ int sane_execvp(const char *file, char * const argv[])
*/
   if (errno == EACCES && !strchr(file, '/'))
   errno = exists_in_PATH(file) ? EACCES : ENOENT;
 + else if (errno == ENOTDIR && !strchr(file, '/'))
 + errno = ENOENT;
   return -1;

Yay. I remember the original 38f865c going through several iterations,
and I am glad we took the time to end up with one that made adding this
case in so simple.
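
Pulled out of the wrapper, the decision the quoted hunk makes can be written as a pure function, which makes the cases easy to see. This is a sketch for reading purposes only: `translate_exec_errno()` is a made-up name, and `found_in_path` stands in for the exists_in_PATH() call that the real code performs itself.

```c
#include <errno.h>
#include <string.h>

/*
 * Decide which errno to report after a failed execvp(file, ...).
 * `found_in_path` is supplied by the caller in this sketch.
 */
static int translate_exec_errno(int err, const char *file, int found_in_path)
{
	if (strchr(file, '/'))
		return err;     /* explicit path: report the error as-is */
	if (err == EACCES)
		return found_in_path ? EACCES : ENOENT;
	if (err == ENOTDIR)
		return ENOENT;  /* a non-directory on PATH */
	return err;
}
```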

-Peff


Re: [PATCH/RFC] grep: add a perlRegexp configuration option

2012-07-31 Thread Junio C Hamano
J Smith dark.pa...@gmail.com writes:

 ... For the sake of completeness (verbosity? pedantry?) I also
 included a setting for extended which is equivalent to true.

Good thinking.

 With this sort of change, would a more generic grep.regexpMode,
 grep.patternType or something similar perhaps be more descriptive,
 with grep.extendedRegexp being aliased for backwards compatibility
 purposes? I could also add that functionality if desired.

A variable called extendedRegexp already reads quite naturally if
it can have value to say what kind of extendedness is desired, at
least to me.  So I do not care too deeply either way.


Re: [PATCH/RFC] grep: add a perlRegexp configuration option

2012-07-31 Thread J Smith
On Tue, Jul 31, 2012 at 4:30 PM, Junio C Hamano gits...@pobox.com wrote:
 J Smith dark.pa...@gmail.com writes:

 ... For the sake of completeness (verbosity? pedantry?) I also
 included a setting for extended which is equivalent to true.

 Good thinking.

 With this sort of change, would a more generic grep.regexpMode,
 grep.patternType or something similar perhaps be more descriptive,
 with grep.extendedRegexp being aliased for backwards compatibility
 purposes? I could also add that functionality if desired.

 A variable called extendedRegexp already reads quite naturally if
 it can have value to say what kind of extendedness is desired, at
 least to me.  So I do not care too deeply either way.

On the flip side, it might be useful to some to have the option to set
the value to fixed for the --fixed-strings argument, in which case
the option becomes less a type of extended regexp and more of a simple
search string. Were that to be the case, I think grep.patternType
would feel the most precise.

I think for completeness at the very least I should work in the
fixed value as a valid value, option naming aside.
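
As a rough sketch of what parsing such a setting could look like, covering only the values mentioned in this thread (true, false, extended, perl, fixed): the enum and `parse_pattern_type()` below are hypothetical names for illustration and not the patch itself.

```c
#include <string.h>

/* Hypothetical result type for the sketch. */
enum pattern_type {
	PATTERN_BASIC,
	PATTERN_EXTENDED,
	PATTERN_FIXED,
	PATTERN_PERL,
	PATTERN_UNKNOWN
};

/* Map a configuration value onto a pattern type. */
static enum pattern_type parse_pattern_type(const char *value)
{
	if (!strcmp(value, "false"))
		return PATTERN_BASIC;      /* old boolean: extended off */
	if (!strcmp(value, "true") || !strcmp(value, "extended"))
		return PATTERN_EXTENDED;   /* "extended" is equivalent to true */
	if (!strcmp(value, "fixed"))
		return PATTERN_FIXED;      /* like --fixed-strings */
	if (!strcmp(value, "perl"))
		return PATTERN_PERL;
	return PATTERN_UNKNOWN;            /* caller decides how to complain */
}
```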


Re: [PATCH v2] macos: lazily initialize iconv

2012-07-31 Thread Linus Torvalds
On Tue, Jul 31, 2012 at 1:16 PM, Junio C Hamano gits...@pobox.com wrote:

 Eek.

Oh, I agree. Doing a full character set conversion both ways is quite
a bit more work.

 Not just write_entry() codepath that creates the final paths on the
 filesystem, you would need to touch lstat() calls that check the
 existence and freshness of the path, once you go that route.  I am
 sure such a change can be made to work, but I am not sure how much
 we would gain from one.

I think it might be interesting. I doubt it matters all that much any
more in Western Europe (Unicode really does seem to have largely taken
over), but I think Japan still uses Shift-JIS a lot.

Although maybe that is starting to fade too.

And it really is just a generalization of the OS X filesystem damage.

Linus


Re: [PATCH/RFC] grep: add a perlRegexp configuration option

2012-07-31 Thread Junio C Hamano
J Smith <dark.pa...@gmail.com> writes:

> On Tue, Jul 31, 2012 at 4:30 PM, Junio C Hamano <gits...@pobox.com> wrote:
>> J Smith <dark.pa...@gmail.com> writes:
>>
>>> ... For the sake of completeness (verbosity? pedantry?) I also
>>> included a setting for "extended" which is equivalent to "true".
>>
>> Good thinking.
>>
>>> With this sort of change, would a more generic grep.regexpMode,
>>> grep.patternType or something similar perhaps be more descriptive,
>>> with grep.extendedRegexp being aliased for backwards compatibility
>>> purposes? I could also add that functionality if desired.
>>
>> A variable called "extendedRegexp" already reads quite naturally if
>> it can have value to say what kind of extendedness is desired, at
>> least to me.  So I do not care too deeply either way.
>
> On the flip side, it might be useful to some to have the option to set
> the value to "fixed" for the --fixed-strings argument, in which case
> the option becomes less a type of extended regexp and more of a simple
> search string. Were that to be the case, I think grep.patternType
> would feel the most precise.
>
> I think for completeness at the very least I should work in the
> "fixed" value as a valid value, option naming aside.

Ok, then grep.patternType it is.

Thanks.


Re: [RFC 1/4 v2] Implement a basic remote helper for svn in C.

2012-07-31 Thread Jonathan Nieder
Florian Achleitner wrote:

> I haven't tried that yet, nor do I remember anything where I've already seen
> two processes writing to the same pipe.

It's a perfectly normal and well supported thing to do.
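
For anyone who has not seen this before, here is a minimal sketch in C. It is
not taken from the remote helper; the function name two_writers_one_pipe()
and the message strings are made up for illustration. Two forked children
inherit the same write end of one pipe and the parent reads until EOF. POSIX
guarantees that writes of at most PIPE_BUF bytes to a pipe are not
interleaved, so each short message arrives intact, though the order of the
two messages is unspecified.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork two children that both write to the same pipe, then read
 * everything back in the parent.  Returns the number of bytes read,
 * or -1 on error.  (Illustrative helper, not part of any patch.)
 */
static ssize_t two_writers_one_pipe(char *buf, size_t len)
{
	static const char *msgs[2] = { "from child 1\n", "from child 2\n" };
	int fds[2];
	size_t total = 0;
	ssize_t n;
	int i;

	if (pipe(fds) < 0)
		return -1;

	for (i = 0; i < 2; i++) {
		pid_t pid = fork();
		if (pid < 0) {
			close(fds[0]);
			close(fds[1]);
			return -1;
		}
		if (pid == 0) {
			/* child: keep only the write end */
			close(fds[0]);
			/* a write of at most PIPE_BUF bytes is atomic */
			if (write(fds[1], msgs[i], strlen(msgs[i])) < 0)
				_exit(1);
			_exit(0);
		}
	}

	/* parent: drop its own write end so read() sees EOF
	 * once both children have exited */
	close(fds[1]);
	while (total < len &&
	       (n = read(fds[0], buf + total, len - total)) > 0)
		total += n;
	close(fds[0]);

	while (wait(NULL) > 0)
		; /* reap both children */
	return (ssize_t)total;
}
```

The one thing that tends to go wrong is forgetting to close the parent's copy
of the write end; the reader then never sees EOF.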

[...]
> Will try that in test-program..

Thanks.

Good luck,
Jonathan


Re: [PATCH] rebase -i: handle fixup of root commit correctly

2012-07-31 Thread Chris Webb
Johannes Sixt <j...@kdbg.org> writes:

> One subtlety to watch out for is when commit messages are edited. That is,
> if you edit the proposed message at 'rebase --continue' after the first
> squash failed, is the new text preserved until the last squash? I *think*
> that previously that was the case.

Hi. Yes, doing this seems to work fine both in the original code, and after
my patch. I've just checked to be certain using my previous test case of
four conflicting squashes again, editing the message at each stage and
ensuring the edits are all retained in the final commit.

Best wishes,

Chris.


[PATCH/RFC 1/2] grep: add basic, extended, fixed, and perl to grep.extendedRegexp

2012-07-31 Thread J Smith
Adds "basic", "extended", "fixed", and "perl" settings to the
grep.extendedRegexp configuration option, which enable the
--basic-regexp, --extended-regexp, --fixed-strings, and --perl-regexp
options by default, respectively. For the purposes of backwards
compatibility, "extended" is equivalent to "true".
---
 Documentation/config.txt   |   6 ++-
 Documentation/git-grep.txt |   6 ++-
 builtin/grep.c |  95 
 grep.h |   8 
 t/t7810-grep.sh| 105 +
 5 files changed, 180 insertions(+), 40 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index a95e5a4..67d9f24 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1211,7 +1211,11 @@ grep.lineNumber::
If set to true, enable '-n' option by default.

 grep.extendedRegexp::
-   If set to true, enable '--extended-regexp' option by default.
+   Sets the default matching behavior. This option can be set to a
+   boolean value or one of 'basic', 'extended', 'fixed', or 'perl'
+   which will enable the '--basic-regexp', '--extended-regexp',
+   '--fixed-strings' or '--perl-regexp' options accordingly. The value
+   of 'true' is equivalent to 'extended'.

 gpg.program::
Use this custom program instead of gpg found on $PATH when
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 3bec036..100328f 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -43,7 +43,11 @@ grep.lineNumber::
If set to true, enable '-n' option by default.

 grep.extendedRegexp::
-   If set to true, enable '--extended-regexp' option by default.
+   Sets the default matching behavior. This option can be set to a
+   boolean value or one of 'basic', 'extended', 'fixed', or 'perl'
+   which will enable the '--basic-regexp', '--extended-regexp',
+   '--fixed-strings' or '--perl-regexp' options accordingly. The value
+   of 'true' is equivalent to 'extended'.


 OPTIONS
diff --git a/builtin/grep.c b/builtin/grep.c
index 29adb0a..249fc7d 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -260,6 +260,55 @@ static int wait_all(void)
 }
 #endif

+static int parse_pattern_type_arg(const char *opt, const char *arg)
+{
+	switch (git_config_maybe_bool(opt, arg)) {
+	case 1:
+		return GREP_PATTERN_TYPE_ERE;
+	case 0:
+		return GREP_PATTERN_TYPE_UNSPECIFIED;
+	default:
+		if (!strcmp(arg, "basic"))
+			return GREP_PATTERN_TYPE_BRE;
+		else if (!strcmp(arg, "extended"))
+			return GREP_PATTERN_TYPE_ERE;
+		else if (!strcmp(arg, "fixed"))
+			return GREP_PATTERN_TYPE_FIXED;
+		else if (!strcmp(arg, "perl"))
+			return GREP_PATTERN_TYPE_PCRE;
+		die("bad %s argument: %s", opt, arg);
+	}
+}
+
+static void grep_pattern_type_options(const int pattern_type, void *cb)
+{
+	struct grep_opt *opt = cb;
+
+	switch (pattern_type) {
+	case GREP_PATTERN_TYPE_BRE:
+		opt->fixed = 0;
+		opt->pcre = 0;
+		opt->regflags &= ~REG_EXTENDED;
+		break;
+
+	case GREP_PATTERN_TYPE_ERE:
+		opt->fixed = 0;
+		opt->pcre = 0;
+		opt->regflags |= REG_EXTENDED;
+		break;
+
+	case GREP_PATTERN_TYPE_FIXED:
+		opt->fixed = 1;
+		opt->pcre = 0;
+		break;
+
+	case GREP_PATTERN_TYPE_PCRE:
+		opt->fixed = 0;
+		opt->pcre = 1;
+		break;
+	}
+}
+
 static int grep_config(const char *var, const char *value, void *cb)
 {
 	struct grep_opt *opt = cb;
@@ -269,10 +318,7 @@ static int grep_config(const char *var, const char *value, void *cb)
 		return -1;

 	if (!strcmp(var, "grep.extendedregexp")) {
-		if (git_config_bool(var, value))
-			opt->regflags |= REG_EXTENDED;
-		else
-			opt->regflags &= ~REG_EXTENDED;
+		grep_pattern_type_options(parse_pattern_type_arg(var, value), opt);
 		return 0;
 	}

@@ -669,14 +715,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	enum {
-		pattern_type_unspecified = 0,
-		pattern_type_bre,
-		pattern_type_ere,
-		pattern_type_fixed,
-		pattern_type_pcre,
-	};
-	int pattern_type = pattern_type_unspecified;
+	int pattern_type = GREP_PATTERN_TYPE_UNSPECIFIED;

 	struct option options[] = {
 		OPT_BOOLEAN(0, "cached", 

[PATCH/RFC 2/2] grep: rename grep.extendedRegexp option to grep.patternType

2012-07-31 Thread J Smith
With the addition of the "basic", "extended", "fixed", and "perl"
values for the grep.extendedRegexp option, the name grep.patternType
better represents the option's functionality. grep.extendedRegexp
remains available as an alias to grep.patternType for the purposes of
backwards compatibility.
---
 Documentation/config.txt   |  5 ++-
 Documentation/git-grep.txt |  5 ++-
 builtin/grep.c |  4 ++-
 t/t7810-grep.sh| 80 ++
 4 files changed, 56 insertions(+), 38 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 67d9f24..9644bba 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -1210,13 +1210,16 @@ gitweb.snapshot::
 grep.lineNumber::
If set to true, enable '-n' option by default.

-grep.extendedRegexp::
+grep.patternType::
Sets the default matching behavior. This option can be set to a
boolean value or one of 'basic', 'extended', 'fixed', or 'perl'
which will enable the '--basic-regexp', '--extended-regexp',
'--fixed-strings' or '--perl-regexp' options accordingly. The value
of 'true' is equivalent to 'extended'.

+grep.extendedRegexp::
+   Alias for grep.patternType.
+
 gpg.program::
Use this custom program instead of gpg found on $PATH when
making or verifying a PGP signature. The program must support the
diff --git a/Documentation/git-grep.txt b/Documentation/git-grep.txt
index 100328f..d51cc19 100644
--- a/Documentation/git-grep.txt
+++ b/Documentation/git-grep.txt
@@ -42,13 +42,16 @@ CONFIGURATION
 grep.lineNumber::
If set to true, enable '-n' option by default.

-grep.extendedRegexp::
+grep.patternType::
Sets the default matching behavior. This option can be set to a
boolean value or one of 'basic', 'extended', 'fixed', or 'perl'
which will enable the '--basic-regexp', '--extended-regexp',
'--fixed-strings' or '--perl-regexp' options accordingly. The value
of 'true' is equivalent to 'extended'.

+grep.extendedRegexp::
+   Alias for grep.patternType.
+

 OPTIONS
 ---
diff --git a/builtin/grep.c b/builtin/grep.c
index 249fc7d..a8c1c32 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -317,7 +317,9 @@ static int grep_config(const char *var, const char *value, void *cb)
 	if (userdiff_config(var, value) < 0)
 		return -1;

-	if (!strcmp(var, "grep.extendedregexp")) {
+	if (!strcmp(var, "grep.patterntype") ||
+		/* for backwards compatibility */
+		!strcmp(var, "grep.extendedregexp")) {
 		grep_pattern_type_options(parse_pattern_type_arg(var, value), opt);
 		return 0;
 	}
diff --git a/t/t7810-grep.sh b/t/t7810-grep.sh
index c21cd61..6bfe368 100755
--- a/t/t7810-grep.sh
+++ b/t/t7810-grep.sh
@@ -239,32 +239,32 @@ do
 		git grep --max-depth 0 -n -e vvv $H -- t . >actual &&
 		test_cmp expected actual
 	'
-	test_expect_success "grep $L with grep.extendedRegexp=false" '
+	test_expect_success "grep $L with grep.patternType=false" '
 		echo "ab:a+bc" >expected &&
-		git -c grep.extendedRegexp=false grep "a+b*c" ab >actual &&
+		git -c grep.patterntype=false grep "a+b*c" ab >actual &&
 		test_cmp expected actual
 	'

-	test_expect_success "grep $L with grep.extendedRegexp=true" '
+	test_expect_success "grep $L with grep.patternType=true" '
 		echo "ab:abc" >expected &&
-		git -c grep.extendedRegexp=true grep "a+b*c" ab >actual &&
+		git -c grep.patterntype=true grep "a+b*c" ab >actual &&
 		test_cmp expected actual
 	'

-	test_expect_success "grep $L with grep.extendedRegexp=extended" '
+	test_expect_success "grep $L with grep.patternType=extended" '
 		echo "ab:abc" >expected &&
-		git -c grep.extendedregexp=extended grep "a+b*c" ab >actual &&
+		git -c grep.patterntype=extended grep "a+b*c" ab >actual &&
 		test_cmp expected actual
 	'

-	test_expect_success "grep $L with grep.extendedRegexp=fixed" '
+	test_expect_success "grep $L with grep.patternType=fixed" '
 		echo "ab:abc" >expected &&
-		git -c grep.extendedregexp=fixed grep "ab" ab >actual &&
+		git -c grep.patterntype=fixed grep "ab" ab >actual &&
 		test_cmp expected actual
 	'

-	test_expect_success "grep $L with a valid regexp and grep.extendedRegexp=fixed " '
-		test_must_fail git -c grep.extendedregexp=fixed grep "a*" ab
+	test_expect_success "grep $L with a valid regexp and grep.patternType=fixed " '
+		test_must_fail git -c grep.patterntype=fixed grep "a*" ab
 	'

 	test_expect_success "grep $L with grep.extendedRegexp=basic" '
@@ -748,91 +748,91 @@ test_expect_success LIBPCRE 'grep -P pattern' '
 	test_cmp expected actual
 '


Re: [PATCH/RFC] grep: add a perlRegexp configuration option

2012-07-31 Thread J Smith
On Tue, Jul 31, 2012 at 5:05 PM, Junio C Hamano <gits...@pobox.com> wrote:
>
> Ok, then grep.patternType it is.
>
> Thanks.

Cool, patches should be on their way. I added options for "basic",
"extended", "fixed" and "perl" for completeness, with the name change
and its BC alias as a separate patch for perusal.

Cheers.


Re: Fix git-svn for SVN 1.7

2012-07-31 Thread Junio C Hamano
Eric Wong <normalper...@yhbt.net> writes:

> Michael G Schwern <schw...@pobox.com> wrote:
>> It just doesn't matter.
>>
>> Why are we arguing over which solution will be 4% better two years from now,
>> or if my commits are formatted perfectly, when tremendous amounts of basic
>> work to be done improving git-svn?  The code is undocumented, lacking unit
>> tests, difficult to understand and riddled with bugs.
>
> Yes it does matter.
>
> git-svn has the problems it has because it traditionally had lower
> review standards than the rest of git.  So yes, we're being more careful
> nowadays about the long-term ramifications of changes.

Thanks.  I know it takes guts to publicly admit that over time your
own creation has become less ideal than you wish it to be, but it
needed to be said.

Michael, please realize that the only reason people comment on the
patch series is because they care about what the series brings to
us.  In other words, your effort is appreciated.  For a change that
we want to have in our codebase, the functionality of the code
immediately after the change is applied of course is important, but
the maintainability of the result also matters.

We want to make sure that anybody who wants to understand and
improve the system can read the code without distraction from
inconsistent coding styles used in different sections of code.  We
want "git log" (or "git log git-svn.perl perl/") output to tell a
coherent story about how the code evolved and why these changes are
made in a consistent voice to the readers.  We want people to be
able to "git log | grep Signed-off-by:" to count the contributors.

A contributor has enough room to be creative in how his or her code
is designed.  Updating the code to follow the "convert as early as
possible" principle, and (during subsequent discussion with Eric)
suggesting use of class instances instead of bare strings to make it
harder to mistakenly use bare unconverted strings, are two examples
where you already showed creativity in areas that matter.

There is no need to be creative in ChangeLog and coding styles; it
only hurts maintainability.

Regarding the operator overloading of "eq" for comparing the
converted strings, I still think it will hurt maintainability (we
want to make sure that it is harder, not easier, to make wrong
changes to the code in the future), but I may be mistaken and you
may have better ideas.  If you can use overloading in such a way
that it won't harm maintainability and yet makes the resulting code
easier to read, I don't have any objection.

What I won't accept is "maintainability does not matter".  It does.

Thanks.


Re: [PATCH/RFC 2/2] grep: rename grep.extendedRegexp option to grep.patternType

2012-07-31 Thread Junio C Hamano
J Smith <dark.pa...@gmail.com> writes:

> With the addition of the "basic", "extended", "fixed", and "perl"
> values for the grep.extendedRegexp option the name grep.patternType
> better represents the option's functionality. grep.extendedRegexp
> remains available as an alias to grep.patternType for the purposes of
> backwards compatibility.
> ---

Sorry for not bringing this up earlier when we discussed grep.patternType,
but my preference would be to introduce grep.patternType with these
type names (including basic and perl) from the beginning, and then
ignore grep.extendedRegexp if grep.patternType is set.

The core part of the change may look something like this...

diff --git a/builtin/grep.c b/builtin/grep.c
index 29adb0a..260a7db 100644
--- a/builtin/grep.c
+++ b/builtin/grep.c
@@ -260,6 +260,57 @@ static int wait_all(void)
 }
 #endif
 
+static int parse_pattern_type_arg(const char *opt, const char *arg)
+{
+	switch (git_config_maybe_bool(opt, arg)) {
+	case 1:
+		return GREP_PATTERN_TYPE_ERE;
+	case 0:
+		return GREP_PATTERN_TYPE_UNSPECIFIED;
+	default:
+		if (!strcmp(arg, "basic"))
+			return GREP_PATTERN_TYPE_BRE;
+		else if (!strcmp(arg, "extended"))
+			return GREP_PATTERN_TYPE_ERE;
+		else if (!strcmp(arg, "fixed"))
+			return GREP_PATTERN_TYPE_FIXED;
+		else if (!strcmp(arg, "perl"))
+			return GREP_PATTERN_TYPE_PCRE;
+		die("bad %s argument: %s", opt, arg);
+	}
+}
+
+static void grep_pattern_type_options(const int pattern_type, void *cb)
+{
+	struct grep_opt *opt = cb;
+
+	switch (pattern_type) {
+	case GREP_PATTERN_TYPE_BRE:
+		opt->fixed = 0;
+		opt->pcre = 0;
+		opt->regflags &= ~REG_EXTENDED;
+		break;
+
+	case GREP_PATTERN_TYPE_ERE:
+		opt->fixed = 0;
+		opt->pcre = 0;
+		opt->regflags |= REG_EXTENDED;
+		break;
+
+	case GREP_PATTERN_TYPE_FIXED:
+		opt->fixed = 1;
+		opt->pcre = 0;
+		opt->regflags &= ~REG_EXTENDED;
+		break;
+
+	case GREP_PATTERN_TYPE_PCRE:
+		opt->fixed = 0;
+		opt->pcre = 1;
+		opt->regflags &= ~REG_EXTENDED;
+		break;
+	}
+}
+
 static int grep_config(const char *var, const char *value, void *cb)
 {
 	struct grep_opt *opt = cb;
@@ -269,10 +320,18 @@ static int grep_config(const char *var, const char *value, void *cb)
 		return -1;
 
 	if (!strcmp(var, "grep.extendedregexp")) {
-		if (git_config_bool(var, value))
-			opt->regflags |= REG_EXTENDED;
-		else
-			opt->regflags &= ~REG_EXTENDED;
+		if (!opt->pattern_type_used) {
+			if (git_config_bool(var, value))
+				opt->regflags |= REG_EXTENDED;
+			else
+				opt->regflags &= ~REG_EXTENDED;
+		}
+		return 0;
+	}
+
+	if (!strcmp(var, "grep.patterntype")) {
+		grep_pattern_type_options(parse_pattern_type_arg(var, value), opt);
+		opt->pattern_type_used = 1;
 		return 0;
 	}
 
@@ -669,14 +728,7 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 	int i;
 	int dummy;
 	int use_index = 1;
-	enum {
-		pattern_type_unspecified = 0,
-		pattern_type_bre,
-		pattern_type_ere,
-		pattern_type_fixed,
-		pattern_type_pcre,
-	};
-	int pattern_type = pattern_type_unspecified;
+	int pattern_type = GREP_PATTERN_TYPE_UNSPECIFIED;
 
 	struct option options[] = {
 		OPT_BOOLEAN(0, "cached", &cached,
@@ -705,16 +757,16 @@ int cmd_grep(int argc, const char **argv, const char *prefix)
 		OPT_GROUP(""),
 		OPT_SET_INT('E', "extended-regexp", &pattern_type,
 			    "use extended POSIX regular expressions",
-			    pattern_type_ere),
+			    GREP_PATTERN_TYPE_ERE),
 		OPT_SET_INT('G', "basic-regexp", &pattern_type,
 			    "use basic POSIX regular expressions (default)",
-			    pattern_type_bre),
+			    GREP_PATTERN_TYPE_BRE),
 		OPT_SET_INT('F', "fixed-strings", &pattern_type,
 			    "interpret patterns as fixed strings",
-			    pattern_type_fixed),
+			    GREP_PATTERN_TYPE_FIXED),
 		OPT_SET_INT('P', "perl-regexp", &pattern_type,
 			    "use Perl-compatible regular expressions",
-			    pattern_type_pcre),
+			    GREP_PATTERN_TYPE_PCRE),
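
The precedence rule proposed in this message (grep.patternType, once seen,
wins over grep.extendedRegexp regardless of the order in which the two are
read) can be checked in isolation with a small stand-alone sketch. Everything
in it is hypothetical and simplified: struct grep_prefs, apply_grep_config()
and the PT_* names are illustrative stand-ins for struct grep_opt and the
GREP_PATTERN_TYPE_* constants, boolean parsing is reduced to comparing
against "true", boolean values for grep.patterntype are left out, and an
unknown value returns -1 instead of calling die().

```c
#include <string.h>

enum pattern_type { PT_UNSPECIFIED, PT_BRE, PT_ERE, PT_FIXED, PT_PCRE };

struct grep_prefs {
	enum pattern_type type;
	int pattern_type_used;	/* set once grep.patterntype has been seen */
};

/* Returns 0 if the key/value pair was understood, -1 otherwise. */
static int apply_grep_config(struct grep_prefs *p,
			     const char *var, const char *value)
{
	if (!strcmp(var, "grep.extendedregexp")) {
		/* legacy boolean: ignored once grep.patterntype is set */
		if (!p->pattern_type_used)
			p->type = !strcmp(value, "true") ? PT_ERE : PT_BRE;
		return 0;
	}

	if (!strcmp(var, "grep.patterntype")) {
		if (!strcmp(value, "basic"))
			p->type = PT_BRE;
		else if (!strcmp(value, "extended"))
			p->type = PT_ERE;
		else if (!strcmp(value, "fixed"))
			p->type = PT_FIXED;
		else if (!strcmp(value, "perl"))
			p->type = PT_PCRE;
		else
			return -1;
		p->pattern_type_used = 1;
		return 0;
	}

	return -1;
}
```

Feeding the two keys in either order ends with the grep.patterntype value,
which is the property the pattern_type_used flag is there to provide.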
   

Re: Fix git-svn for SVN 1.7

2012-07-31 Thread Michael G Schwern
On 2012.7.31 1:01 PM, Eric Wong wrote:
> Michael G Schwern <schw...@pobox.com> wrote:
>> It just doesn't matter.
>>
>> Why are we arguing over which solution will be 4% better two years from now,
>> or if my commits are formatted perfectly, when tremendous amounts of basic
>> work to be done improving git-svn?  The code is undocumented, lacking unit
>> tests, difficult to understand and riddled with bugs.
>
> Yes it does matter.
>
> git-svn has the problems it has because it traditionally had lower
> review standards than the rest of git.  So yes, we're being more careful
> nowadays about the long-term ramifications of changes.

Yes, review does matter.  And so far we've been arguing over whether reviewing
objects-with-overloading or objects-without-overloading would be better.  And
we can argue about that forever.

That's the part that doesn't matter.  People matter.

I think we can all agree that either solution is a vast improvement along
multiple axes, including review.  So what really matters is making sure one of
them gets done.  Once either of them is done, we can see how it works out in
practice instead of arguing theoretical futures.  Once either of them is done,
it's much easier to switch to the other.

What I'm trying to say is I have much less interest in doing it without the
overloading.  It's not interesting to me.  It's no fun.  No fun means no
patch.  No patch means no improvement.  No improvement is the worst of all
possible options.

I had a lot of enthusiasm for this project when I came in.  I like refactoring
Perl code.  I like git.  That's all but sunk at how painful and slow and
nit-picking the process has been.  We've barely talked about the content of
the patches I've submitted, it's all process.  This is no fun.

We're all volunteers here and we're all getting something personal out of
this.  Some form of personal enjoyment.  I'm not getting that, so I'm unlikely
to stick around.


-- 
Defender of Lexical Encapsulation


Re: Fix git-svn for SVN 1.7

2012-07-31 Thread Michael G Schwern
On 2012.7.31 4:05 PM, Junio C Hamano wrote:
> What I won't accept is "maintainability does not matter".  It does.

I'm sorry, that's not what I intended to convey at all.  My reply to Eric lays
it out more clearly, I think.


-- 
Reality is that which, when you stop believing in it, doesn't go away.
-- Philip K. Dick


Broken git-svn tests known?

2012-07-31 Thread Ammon Riley
Hi,

On a freshly checked out copy of the maint branch (0e4c8822), the
t9100-git-svn-basic.sh tests are failing 21 of 25 tests. Is this
known, or am I missing some dependencies? Is it possibly due to
using subversion 1.7?

I've run into a small bug with git-svn, and wanted to make sure
the test suite still passed with my patch applied.

Cheers,
Ammon


Re: [WIP PATCH] Manual rename correction

2012-07-31 Thread Jeff King
On Tue, Jul 31, 2012 at 01:20:42PM -0700, Junio C Hamano wrote:

> Jeff King <p...@peff.net> writes:
>
>> A much better hint is to annotate pairs of sha1s, to say "do not bother
>> doing inexact rename correlation on this pair; I promise that they have
>> value N".
>
> Surely.  And I suspect that the patch to the current codebase to do
> so would be much less impact if you go that way.

Yes. You may remember I wrote a generic caching subsystem last summer
when we were talking about caching commit generations. Plugging in a new
map type to map sha1 pairs to 32-bit integers was pretty simple, and
that gives the basis for a rename cache.

It's fairly unimpressive on git.git. My best-of-five for "git log
--format=%H --raw -M" went from 5.83s to 5.74s, which is pretty much
within the run-to-run noise. The resulting cache was 155K.

However, it's easy to come up with much more pathological cases. I have
a really ugly rename-and-tweak-tags commit on my photo repository, and
those blobs are relatively big. My timings for "git show" on that were:

  before: 49.724s
  after, run 1: 54.904s
  after, run 2:  0.117s

Which is pretty darn nice. The resulting cache is 5.3M (the repository
itself is in the gigabytes, but that's not really relevant; the cache
will obviously scale with the number of paths, not with the size of the
blobs).

It would work for copies, too, of course. Here are the results of
"git log --format=%H --raw -M -C -C" on git.git:

  before: 1m35s
  after, run 1: 39.7s
  after, run 2: 39.5s

So it does make much more of a difference for copies, which is obvious;
git is doing a lot more work for us to cache. At the same time, our
cache is much bigger: 32M. Yikes.

My cache is fairly naive, in that it literally stores 44 bytes of
<src_sha1, dst_sha1, score> for each entry. At the cost of more
complexity, you could store each src_sha1 once, followed by a set of
<dst_sha1, score> pairs. I also didn't take any special care to avoid
duplicates of <X, Y> and <Y, X> (since presumably these renames would be
commutative). I'm not sure it is necessary, though; I think the copy
machinery already suppresses this when entries are in both source and
destination lists.
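
The 44 bytes per entry follow from two 20-byte SHA-1s plus a 32-bit score. A
minimal sketch of such a record, and of one way to give the two orderings of
a pair a single key, might look like the following. The struct and function
names are hypothetical and are not the ones used in the metadata-cache code,
and ordering the pair is only safe under the assumption (presumed above, not
verified) that the score is symmetric.

```c
#include <stdint.h>
#include <string.h>

#define SHA1_RAWSZ 20

/* One cache record: 20 + 20 + 4 = 44 bytes, with no padding needed. */
struct rename_cache_entry {
	unsigned char src_sha1[SHA1_RAWSZ];
	unsigned char dst_sha1[SHA1_RAWSZ];
	uint32_t score;
};

/* Compile-time check (C11) that the layout really is 44 bytes. */
_Static_assert(sizeof(struct rename_cache_entry) == 44,
	       "rename cache entry should be 44 bytes");

/*
 * Order a pair of hashes so that (X, Y) and (Y, X) produce the same
 * key.  Hypothetical helper: *lo receives the smaller hash in memcmp
 * order and *hi the larger one.
 */
static void canonical_pair(const unsigned char *a, const unsigned char *b,
			   const unsigned char **lo, const unsigned char **hi)
{
	if (memcmp(a, b, SHA1_RAWSZ) <= 0) {
		*lo = a;
		*hi = b;
	} else {
		*lo = b;
		*hi = a;
	}
}
```

Storing each src_sha1 once with a list of destination records would shrink
the per-pair cost from 44 to roughly 24 bytes, at the price of a
variable-length layout.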

So I don't know. It can definitely speed up some operations, but at the
cost of a non-trivial cache on disk. I'll spare you all of the generic
caching infrastructure, but the actual patch to rename looks like this
(just to give a sense of where the hooks go):

diff --git a/diffcore-rename.c b/diffcore-rename.c
index 216a7a4..db70878 100644
--- a/diffcore-rename.c
+++ b/diffcore-rename.c
@@ -6,6 +6,7 @@
 #include "diffcore.h"
 #include "hash.h"
 #include "progress.h"
+#include "metadata-cache.h"
 
 /* Table of rename/copy destinations */
 
@@ -137,7 +138,8 @@ static int estimate_similarity(struct diff_filespec *src,
 	 */
 	unsigned long max_size, delta_size, base_size, src_copied, literal_added;
 	unsigned long delta_limit;
-	int score;
+	uint32_t score;
+	struct sha1pair pair;
 
 	/* We deal only with regular files.  Symlink renames are handled
 	 * only when they are exact matches --- in other words, no edits
@@ -175,6 +177,11 @@ static int estimate_similarity(struct diff_filespec *src,
 	if (max_size * (MAX_SCORE-minimum_score) < delta_size * MAX_SCORE)
 		return 0;
 
+	hashcpy(pair.one, src->sha1);
+	hashcpy(pair.two, dst->sha1);
+	if (rename_cache_get(&pair, &score))
+		return score;
+
 	if (!src->cnt_data && diff_populate_filespec(src, 0))
 		return 0;
 	if (!dst->cnt_data && diff_populate_filespec(dst, 0))
@@ -195,6 +202,7 @@ static int estimate_similarity(struct diff_filespec *src,
 		score = 0; /* should not happen */
 	else
 		score = (int)(src_copied * MAX_SCORE / max_size);
+	rename_cache_set(&pair, &score);
 	return score;
 }
 
-Peff


Re: [WIP PATCH] Manual rename correction

2012-07-31 Thread Nguyen Thai Ngoc Duy
On Wed, Aug 1, 2012 at 2:23 AM, Jeff King <p...@peff.net> wrote:
>> It is a good direction to go in, I would think, to give users a way
>> to explicitly tell that "in comparison between these two trees, I
>> know path B in the postimage corresponds to path A in the preimage".
>
> I do not think that is the right direction. Let's imagine that I have a
> commit A and I annotate it (via notes or whatever) to say "between
> A^^{tree} and A^{tree}, foo.c became bar.c". That will help me when
> doing "git show" or "git log". But it will not help me when I later try
> to merge A (or its descendant). In that case, I will compute the diff
> between A and the merge-base (or worse, some descendant of A and the
> merge-base), and I will miss this hint entirely.
>
> A much better hint is to annotate pairs of sha1s, to say "do not bother
> doing inexact rename correlation on this pair; I promise that they have
> value N".

I haven't had time to think it through yet, but I'll throw in my
thoughts anyway. I actually went with your approach first. But it's more
difficult to control the renaming. Assume we want to tell git to
rename SHA-1 A to SHA-1 B. What happens if we have two As in the
source tree and two Bs in the target tree? What happens if two As and
one B, or one A and two Bs? What if a user defines A -> B and A -> C,
and we happen to have two As in source tree and B and C in target
tree?

There's also the problem with transferring this information. With
git-notes I think I can transfer it (though not automatically). How do
we transfer sha1 map (that you mentioned in the commit generation mail
in this thread)?

> Then it will find that pair no matter which trees or commits
> are being diffed, and it will do so relatively inexpensively[1].

But does that happen often in practice? I mean diff-ing two arbitrary
trees and expect rename correction. I disregarded it as "git log" is
my main case, but I'm just a single user..
-- 
Duy


Re: [WIP PATCH] Manual rename correction

2012-07-31 Thread Nguyen Thai Ngoc Duy
On Wed, Aug 1, 2012 at 9:01 AM, Jeff King <p...@peff.net> wrote:
> On Wed, Aug 01, 2012 at 08:10:12AM +0700, Nguyen Thai Ngoc Duy wrote:
>
>>> I do not think that is the right direction. Let's imagine that I have a
>>> commit A and I annotate it (via notes or whatever) to say "between
>>> A^^{tree} and A^{tree}, foo.c became bar.c". That will help me when
>>> doing "git show" or "git log". But it will not help me when I later try
>>> to merge A (or its descendant). In that case, I will compute the diff
>>> between A and the merge-base (or worse, some descendant of A and the
>>> merge-base), and I will miss this hint entirely.
>>>
>>> A much better hint is to annotate pairs of sha1s, to say "do not bother
>>> doing inexact rename correlation on this pair; I promise that they have
>>> value N".
>>
>> I haven't had time to think it through yet but I throw my thoughts in
>> any way. I actually went with your approach first. But it's more
>> difficult to control the renaming. Assume we want to tell git to
>> rename SHA-1 A to SHA-1 B. What happens if we have two As in the
>> source tree and two Bs in the target tree? What happens if two As and
>> one B, or one A and two Bs? What if a user defines A -> B and A -> C,
>> and we happen to have two As in source tree and B and C in target
>> tree?
>
> Yes, it disregards path totally. But if you had the exact same movement
> of content from one path to another in one instance, and it is
> considered a rename, wouldn't it also be a rename in a second instance?

Yes. This is probably cosmetic only, but without path information, we
leave it to chance to decide which A to pair with B and C (in the
A->B, A->C case above). A wrong path might lead to funny effects (I'm
thinking of "git log --follow").

>> There's also the problem with transferring this information. With
>> git-notes I think I can transfer it (though not automatically). How do
>> we transfer sha1 map (that you mentioned in the commit generation mail
>> in this thread)?

I wasn't clear. This is about transferring info across repositories.

> That is orthogonal to the issue of what is being stored. I chose my
> mmap'd disk implementation because it is very fast, which makes it nice
> for a performance cache. But you could store the same thing in git-notes
> (indexed by dst sha1, I guess, and then pointing to a blob of (src,
> score) pairs).
>
> If you want to include path-based hints in a commit, I'd say that using
> some micro-format in the commit message would be the simplest thing.

Rename correction happens after the commit is created. I don't think we
can recreate commits.

> But
> that has been discussed before; ultimately the problem is that it only
> covers _one_ diff that we do with that commit (it is probably the most
> common, of course, but it doesn't cover them all).

How about we generate the sha1 mapping from commit hints? We try to take
advantage of path hints when we can. Else we fall back to the sha-1
mapping. This way we can transfer commit hints as git-notes to another
repo, then regenerate the sha-1 mapping there. No need to transfer sha1
maps.
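
To make the proposal concrete, here is a rough sketch of "prefer a path hint,
else fall back to the sha-1 mapping". The struct rename_hint layout and
lookup_rename() are invented for illustration (nothing like them exists in
git), and blob ids are shortened to one-letter strings. The sketch also shows
the ambiguity raised earlier: with A->B recorded for foo.c and A->C recorded
for baz.c, a third path carrying blob A simply gets whichever hint comes
first.

```c
#include <stddef.h>
#include <string.h>

/* One hint as it might be regenerated from a note on a commit. */
struct rename_hint {
	const char *src_id;	/* blob id of the rename source */
	const char *src_path;	/* path the hint was recorded for */
	const char *dst_id;	/* blob id the source was renamed to */
};

/*
 * Return the destination blob id for (src_id, src_path), or NULL if
 * no hint applies.  Hypothetical helper, linear scan for brevity.
 */
static const char *lookup_rename(const struct rename_hint *hints, size_t nr,
				 const char *src_id, const char *src_path)
{
	size_t i;

	/* first pass: a hint recorded for this exact path wins */
	for (i = 0; i < nr; i++)
		if (!strcmp(hints[i].src_id, src_id) &&
		    !strcmp(hints[i].src_path, src_path))
			return hints[i].dst_id;

	/* second pass: content-only fallback, first match wins */
	for (i = 0; i < nr; i++)
		if (!strcmp(hints[i].src_id, src_id))
			return hints[i].dst_id;

	return NULL;
}
```

Whether "first match wins" is acceptable for the fallback is exactly the open
question; the sketch only makes the choice visible.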

>>> Then it will find that pair no matter which trees or commits
>>> are being diffed, and it will do so relatively inexpensively[1].
>>
>> But does that happen often in practice? I mean diff-ing two arbitrary
>> trees and expect rename correction. I disregarded it as "git log" is
>> my main case, but I'm just a single user..
>
> It happens every time merge-recursive does rename detection, which
> includes "git merge" but also things like cherry-pick.

Thanks. I'll look into merge/cherry-pick.
-- 
Duy