Re: What's cooking in git.git (Oct 2013, #04; Fri, 18)

2013-10-19 Thread Felipe Contreras
Jeff King wrote:
> On Fri, Oct 18, 2013 at 03:14:49PM -0700, Junio C Hamano wrote:
> 
> > * jn/add-2.0-u-A-sans-pathspec (2013-04-26) 1 commit
> >  - git add: -u/-A now affects the entire working tree
> > 
> >  Will cook in 'next' until Git 2.0.
> > 
> > 
> > * jc/core-checkstat-2.0 (2013-05-06) 1 commit
> >  - core.statinfo: remove as promised in Git 2.0
> > 
> >  Will cook in 'next' until Git 2.0.
> > 
> > 
> > * jc/push-2.0-default-to-simple (2013-06-18) 1 commit
> >  - push: switch default from "matching" to "simple"
> > 
> >  Will cook in 'next' until Git 2.0.
> > 
> > 
> > * jc/add-2.0-ignore-removal (2013-04-22) 1 commit
> >  - git add ... defaults to "-A"
> > 
> >  Updated endgame for "git add " that defaults to "--all"
> >  aka "--no-ignore-removal".
> > 
> >  Will cook in 'next' until Git 2.0.
> 
> I notice that these are not actually in 'next', despite the
> descriptions.  Should they be, to give them wider exposure?

I say they shouldn't be, unless the next version after 1.8 is 2.0. There should
be a separate branch for 2.0.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


separate-git-dir doesn't work with mapped drive

2013-10-19 Thread Ain Valtin
Hi

I want to use git in a VirtualBox guest so that the repository is on
the host drive. So in the VB settings for the guest I set up a shared
folder "gitRepos" to /home/ain with full access rights. Then in the
guest OS (Windows XP) I map this shared folder as G drive. Now in the
project dir I execute

C:\...\TPP>git init --separate-git-dir g:/TPP
Initialized empty Git repository in g:/TPP/

Checked, the repo structure is in the "g:/TPP/" (thus the guest OS can
write to the mapped dir) and in the .git file created to the project
dir there is line

gitdir: g:/TPP

However when tring to use the repo it fails to recognise the g:/TPP path, ie

C:\...\TPP>git add .
fatal: unable to access '../../../../../../g:/TPP/config': Invalid argument

Also tryed "gitdir: //VBOXSVR/gitRepos/TPP" but this fails too:

C:\...\TPP>git add .
fatal: Unable to create 'C:/Documents and
Settings/Ain/prog/AVT/TPP/../../../../../..///VBOXSVR/gitRepos/TPP/index.lock':
No such file or directory

Am I doing something wrong or is it a bug? Any idea how to get it to work?

BTW the VB is 3.0.14 ie rather old version but it seems that this
isn't the problem as the git init recognises the mapped drive but
other commands fail.
git version is 1.8.4.msysgit.0


TIA
ain
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: separate-git-dir doesn't work with mapped drive

2013-10-19 Thread Duy Nguyen
There seems to be some regression fixes regarding dos drives. The one
that caught my eyes is 7fbd422 (relative_path should honor
dos-drive-prefix - 2013-10-14) but it's not released yet. And I'm not
sure if msys branch picks it up yet even if you want to rebuild and
test it yourself. Copying Jiang Xin, maybe he can tell if that commit
should fix what you describe here, or it's a new bug.
-- 
Duy

On Sat, Oct 19, 2013 at 3:49 PM, Ain Valtin  wrote:
> Hi
>
> I want to use git in a VirtualBox guest so that the repository is on
> the host drive. So in the VB settings for the guest I set up a shared
> folder "gitRepos" to /home/ain with full access rights. Then in the
> guest OS (Windows XP) I map this shared folder as G drive. Now in the
> project dir I execute
>
> C:\...\TPP>git init --separate-git-dir g:/TPP
> Initialized empty Git repository in g:/TPP/
>
> Checked, the repo structure is in the "g:/TPP/" (thus the guest OS can
> write to the mapped dir) and in the .git file created to the project
> dir there is line
>
> gitdir: g:/TPP
>
> However when tring to use the repo it fails to recognise the g:/TPP path, ie
>
> C:\...\TPP>git add .
> fatal: unable to access '../../../../../../g:/TPP/config': Invalid argument
>
> Also tryed "gitdir: //VBOXSVR/gitRepos/TPP" but this fails too:
>
> C:\...\TPP>git add .
> fatal: Unable to create 'C:/Documents and
> Settings/Ain/prog/AVT/TPP/../../../../../..///VBOXSVR/gitRepos/TPP/index.lock':
> No such file or directory
>
> Am I doing something wrong or is it a bug? Any idea how to get it to work?
>
> BTW the VB is 3.0.14 ie rather old version but it seems that this
> isn't the problem as the git init recognises the mapped drive but
> other commands fail.
> git version is 1.8.4.msysgit.0
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack corruption post-mortem

2013-10-19 Thread Duy Nguyen
On Wed, Oct 16, 2013 at 3:34 PM, Jeff King  wrote:
> I was recently presented with a repository with a corrupted packfile,
> and was asked if the data was recoverable. This post-mortem describes
> the steps I took to investigate and fix the problem. I thought others
> might find the process interesting, and it might help somebody in the
> same situation.
>
> I started with an fsck, which found a problem with exactly one object
> (I've used $pack and $obj below to keep the output readable, and also
> because I'll refer to them later):
>
> $ git fsck
> error: $pack SHA1 checksum mismatch
> error: index CRC mismatch for object $obj from $pack at offset 51653873
> error: inflate: data stream error (incorrect data check)
> error: cannot unpack $obj from $pack at offset 51653873

I wonder if we should protect the sha-1 and pathname tables in packv4
with CRC too. A bit flipped in there could cause stream of corrupt
objects and make it hard to pinpoint the corrupt location..
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: GSoC 2014: Summary so far, discussion starter: how to improve?

2013-10-19 Thread Christian Couder
Hi,

On Sat, Oct 19, 2013 at 8:09 AM, Thomas Rast  wrote:
>
> Previous Episodes
> =
>
> Git participated in Google Summer of Code (GSoC) 2007-2012, but did not
> participate in 2013 based on discussion in February [1].  At Git-Merge
> in Berlin there was a discussion round [2] that this post attempts to
> summarize as a basis for further discussion and (hopefully)
> participation in GSoC'14.
>
> Much sooner than in previous years, Google has already confirmed that
> GSoC'14 will in fact happen [3].
>
> Note that while mistakes herein are mine, many ideas and opinions
> aren't.  This really tries to be a summary.  Please flame >/dev/null,
> not me.

Thank you for sending this very good summary.

> Theories
> 
>
> These are the hypotheses that I have heard (mostly in [1] and [2]) as
> to what is bad about Git's prior GSoC participations.
>
> * Aiming far too high, focusing on cool/shiny projects with a large
>   impact.  This also affects the students, who tend to cluster around
>   the largest, shiniest project suggestions.

Yeah, focusing on _too big_ projects.

> * Diminishing returns: Git is too mature, with little low-hanging
>   fruit left, making such projects harder
>
> * Projects are too political, progress depending on non-technical
>   arguments
>
> * Our mentors suck on various axes, notably not supporting students
>   enough in things that matter:
>   - smooth interaction with community
>   - ensure fast iteration/short cycles
>   - navigating the code base

Yeah, "fast iteration/short cycles" means submitting the work early
and often _to the mailing list_.

One of the thing we learned too is that focusing on selecting the
"best" students didn't really worked.
What worked was selecting students who have already contributed
patches, but unfortunately there are not many potential students who
have already contributed.

> Further steps
> =
>
> * Discuss :-)
>
>   And then apply this hard-won knowledge systematically.  In
>   particular I think it wouldn't hurt to keep the project proposals
>   out of this thread until we have agreed on goals/standards to
>   measure them against.

Based on the above, what about the following:

1) Advertise early and widely that we will very strongly favor
students who have already contributed and that we can help them
contribute long before their application process starts. Maybe we
could even have a pre GSoC application process for potential students.

2) Advertise that we will really favor small projects over big/shiny ones.

3) Come up with a list of criteria like the following to measure
student/application:

- has the student already contributed much: 0-30 points:
0 means no contribution to any open source project,
5 means some contribution to another open source project,
20 means long time Git contributor

- what is the size of the project: 0-10 points:
0 means big project (one month or more for a regular contributor)
10 means small project (one week for a regular contributor)

- does the student appear willing to act on advice: 0-5 points

- does the student appear to have the necessary skills: 0-5 points

- does the proposal look good: 0-5 points

> * Gather a list of potential mentors.

I would be happy to mentor next year.

Thanks,
Christian.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Prevent buffer overflows when path is too big

2013-10-19 Thread Antoine Pelisse
Currently, most buffers created with PATH_MAX length, are not checked
when being written, and can overflow if PATH_MAX is not big enough to
hold the path.

Fix that by using strlcpy() where strcpy() was used, and also run some
extra checks when copy is done with memcpy().

Reported-by: Wataru Noguchi 
Signed-off-by: Antoine Pelisse 
---
 abspath.c| 10 +++---
 diffcore-order.c |  2 +-
 entry.c  | 14 ++
 unpack-trees.c   |  2 ++
 4 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/abspath.c b/abspath.c
index 64adbe2..0e60ba4 100644
--- a/abspath.c
+++ b/abspath.c
@@ -216,11 +216,15 @@ const char *absolute_path(const char *path)
 const char *prefix_filename(const char *pfx, int pfx_len, const char *arg)
 {
static char path[PATH_MAX];
+
+   if (pfx_len > PATH_MAX)
+   die("Too long prefix path: %s", pfx);
+
 #ifndef GIT_WINDOWS_NATIVE
if (!pfx_len || is_absolute_path(arg))
return arg;
memcpy(path, pfx, pfx_len);
-   strcpy(path + pfx_len, arg);
+   strlcpy(path + pfx_len, arg, PATH_MAX - pfx_len);
 #else
char *p;
/* don't add prefix to absolute paths, but still replace '\' by '/' */
@@ -228,8 +232,8 @@ const char *prefix_filename(const char *pfx, int pfx_len, 
const char *arg)
pfx_len = 0;
else if (pfx_len)
memcpy(path, pfx, pfx_len);
-   strcpy(path + pfx_len, arg);
-   for (p = path + pfx_len; *p; p++)
+   strlcpy(path + pfx_len, arg, PATH_MAX - pfx_len);
+   for (p = path + pfx_len; p < path + PATH_MAX && *p; p++)
if (*p == '\\')
*p = '/';
 #endif
diff --git a/diffcore-order.c b/diffcore-order.c
index 23e9385..f083c82 100644
--- a/diffcore-order.c
+++ b/diffcore-order.c
@@ -76,7 +76,7 @@ static int match_order(const char *path)
char p[PATH_MAX];
 
for (i = 0; i < order_cnt; i++) {
-   strcpy(p, path);
+   strlcpy(p, path, PATH_MAX);
while (p[0]) {
char *cp;
if (!fnmatch(order[i], p, 0))
diff --git a/entry.c b/entry.c
index acc892f..39bee42 100644
--- a/entry.c
+++ b/entry.c
@@ -50,17 +50,20 @@ static void remove_subtree(const char *path)
struct dirent *de;
char pathbuf[PATH_MAX];
char *name;
+   size_t pathlen;
 
if (!dir)
die_errno("cannot opendir '%s'", path);
-   strcpy(pathbuf, path);
-   name = pathbuf + strlen(path);
+   strlcpy(pathbuf, path, PATH_MAX);
+   pathlen = strlen(path);
+   name = pathbuf + pathlen;
*name++ = '/';
+   pathlen++;
while ((de = readdir(dir)) != NULL) {
struct stat st;
if (is_dot_or_dotdot(de->d_name))
continue;
-   strcpy(name, de->d_name);
+   strlcpy(name, de->d_name, PATH_MAX - pathlen);
if (lstat(pathbuf, &st))
die_errno("cannot lstat '%s'", pathbuf);
if (S_ISDIR(st.st_mode))
@@ -244,8 +247,11 @@ int checkout_entry(struct cache_entry *ce,
if (topath)
return write_entry(ce, topath, state, 1);
 
+   if (len > PATH_MAX + 1)
+   die("Too long path: %s", state->base_dir);
+
memcpy(path, state->base_dir, len);
-   strcpy(path + len, ce->name);
+   strlcpy(path + len, ce->name, PATH_MAX + 1 - len);
len += ce_namelen(ce);
 
if (!check_path(path, len, &st, state->base_dir_len)) {
diff --git a/unpack-trees.c b/unpack-trees.c
index 1a61e6f..85473b1 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -918,6 +918,8 @@ static int clear_ce_flags_1(struct cache_entry **cache, int 
nr,
int processed;
 
len = slash - name;
+   if (len + prefix_len >= PATH_MAX)
+   len = PATH_MAX - prefix_len - 1;
memcpy(prefix + prefix_len, name, len);
 
/*
-- 
1.8.4.1.507.g9768648.dirty

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: separate-git-dir doesn't work with mapped drive

2013-10-19 Thread Fredrik Gustafsson
On Sat, Oct 19, 2013 at 11:49:27AM +0300, Ain Valtin wrote:
> Hi
> 
> I want to use git in a VirtualBox guest so that the repository is on
> the host drive. So in the VB settings for the guest I set up a shared
> folder "gitRepos" to /home/ain with full access rights. Then in the
> guest OS (Windows XP) I map this shared folder as G drive. Now in the
> project dir I execute
> 
> C:\...\TPP>git init --separate-git-dir g:/TPP
> Initialized empty Git repository in g:/TPP/
> 
> Checked, the repo structure is in the "g:/TPP/" (thus the guest OS can
> write to the mapped dir) and in the .git file created to the project
> dir there is line
> 
> gitdir: g:/TPP
> 
> However when tring to use the repo it fails to recognise the g:/TPP path, ie
> 
> C:\...\TPP>git add .
> fatal: unable to access '../../../../../../g:/TPP/config': Invalid argument
> 
> Also tryed "gitdir: //VBOXSVR/gitRepos/TPP" but this fails too:
> 
> C:\...\TPP>git add .
> fatal: Unable to create 'C:/Documents and
> Settings/Ain/prog/AVT/TPP/../../../../../..///VBOXSVR/gitRepos/TPP/index.lock':
> No such file or directory
> 
> Am I doing something wrong or is it a bug? Any idea how to get it to work?
> 
> BTW the VB is 3.0.14 ie rather old version but it seems that this
> isn't the problem as the git init recognises the mapped drive but
> other commands fail.
> git version is 1.8.4.msysgit.0
> 
> 
> TIA
> ain
> --
> To unsubscribe from this list: send the line "unsubscribe git" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

Git has a bad history with mapped drives in windows. It's also usually a
bad idea to use git over the network (and most mapped drives are over
the network and not local between virt. machines).

I would advise not to use this setup since for the past two years that
git has sometime worked and sometimes not with this setup. (It's not
just seperate git dir, a git dir at all over a smb share have been
problematic).

(This is probably something we should have in a test-suite somewhere.)

-- 
Med vänliga hälsningar
Fredrik Gustafsson

tel: 0733-608274
e-post: iv...@iveqy.com
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: separate-git-dir doesn't work with mapped drive

2013-10-19 Thread Ain Valtin
On Sat, Oct 19, 2013 at 2:11 PM, Fredrik Gustafsson  wrote:
>
> Git has a bad history with mapped drives in windows. It's also usually a
> bad idea to use git over the network (and most mapped drives are over
> the network and not local between virt. machines).
>
> I would advise not to use this setup since for the past two years that
> git has sometime worked and sometimes not with this setup. (It's not
> just seperate git dir, a git dir at all over a smb share have been
> problematic).

That's a shame :(
As I wrote I want to use git in a virtual machine and as a extra
precaution it would be nice to have the repo outside of the VM, on the
host drive - should the VM not to start up for whatever reason I
wouldn't lose my repo with it...


ain
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: separate-git-dir doesn't work with mapped drive

2013-10-19 Thread Fredrik Gustafsson
On Sat, Oct 19, 2013 at 02:45:46PM +0300, Ain Valtin wrote:
> On Sat, Oct 19, 2013 at 2:11 PM, Fredrik Gustafsson  wrote:
> >
> > Git has a bad history with mapped drives in windows. It's also usually a
> > bad idea to use git over the network (and most mapped drives are over
> > the network and not local between virt. machines).
> >
> > I would advise not to use this setup since for the past two years that
> > git has sometime worked and sometimes not with this setup. (It's not
> > just seperate git dir, a git dir at all over a smb share have been
> > problematic).
> 
> That's a shame :(
> As I wrote I want to use git in a virtual machine and as a extra
> precaution it would be nice to have the repo outside of the VM, on the
> host drive - should the VM not to start up for whatever reason I
> wouldn't lose my repo with it...
> 
> 
> ain

A better way (if you can afford it) is to have a repo on the virtual
machine and push to a repo on your hostmachine (so that you've two
repos). However your solution "should" work but I personally have had
some problems with that area.

-- 
Med vänliga hälsningar
Fredrik Gustafsson

tel: 0733-608274
e-post: iv...@iveqy.com
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: GSoC 2014: Summary so far, discussion starter: how to improve?

2013-10-19 Thread Felipe Contreras
Thomas Rast wrote:
> * Diminishing returns: Git is too mature, with little low-hanging
>   fruit left, making such projects harder

Too mature? Aren't there other projects that are "too mature" as well? In
particular I'm thinking about the Linux kernel.

I think it's not about maturity, but how they deal with change.

> * Does libgit2 want to remain under the Git umbrella, or participate
>   on its own?

Personally I don't see what libgit2 has to do with git.git. There isn't even
communication between the two projects.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2 00/14] Officially start moving to the term 'staging area'

2013-10-19 Thread Philip Oakley

From: "Karsten Blees" 

Am 15.10.2013 00:29, schrieb Felipe Contreras:

tl;dr: everyone except Junio C Hamano and Drew Northup agrees; we
should move
away from the name "the index".

It has been discussed many times in the past that 'index' is not an
appropriate description for what the high-level user does with it,
and
it has been agreed that 'staging area' is the best term.



I haven't followed the previous discussion, but if a final conclusion
towards 'staging area' has already been reached, it should probably be
revised.


Do you mean that how that conclusion was reached should be summarised,
or that you don't think it's an appropriate summary of the broader
weltanschauung?




The term 'staging area' is more intuitive for newcomers which are
more
familiar with English than with Git, and it seems to be a
straightforward mental notion for people with different mother
tongues.

In fact it is so intuitive that it's used already in a lot online
documentation, and the people that do teach Git professionally use
this
term, because it's easier for many kinds of audiences to grasp.



Such online documentation often portraits the 'staging area' as some
supposedly 'unique' git feature, which I find _very_ confusing. In
fact, every major SCM has a staging area. E.g. you first need to
"svn/hg/bzr/p4 add/remove/rename/move" a file, which is somehow
recorded in the working copy. These recorded changes will then be
picked up by a subsequent "svn/hg/bzr/p4 commit/submit".

Additionally, all those systems support selectively committing
individual files (by specifying the files on the commit command line
or selecting them in a GUI).

So git's 'unique staging area' boils down to this:

1.) Recording individual files to commit in advance (instead of
specifying them at commit time). Which isn't that hard to grasp.


For many, that separation of preparation(s), from the final action, is
brand new and difficult to appreciate - it's special to computer systems
(where copying is 100% reliable, essentially instantaneous, and in Git's
case, 100% verifiable via crypto checksums).



2.) Recording individual diff hunks or even lines to commit. Which is
an advanced feature that requires even more advanced commands to be
useful (i.e. stash save --keep-index; make; test; commit; stash pop).
It is also entirely irrelevant to binary files (i.e. for non-technical
users that use git to track documents and stuff).


index: an 'index' is a guide of pointers to something else; a book
index has a list of entries so the reader can locate information
easily without having to go through the whole book. Git porcelain is
not using the staging area to find out entries quicker; it's not an
index.



The 'staging area' is a sorted list of most recently checked out
files, and its primary purpose is to quickly detect changes in the
working copy (i.e. its an index).



There is a big (human) problem here. We (humans) are able to believe
contradictory things ("He ain't heavy, he's my brother" to quote a
song). The Index (file) isn't a staging area, but we are happy to flip
flop between the two ideas depending on context - others can feel
confused.

In one sense the "Index" is an implementation detail of the concept of a
packing area where a shipment (commit) is prepared, which is most
commonly called the staging are in populist discussions (which I believe 
is the summary I mentioned above)



Of the 28 builtin main porcelain commands, 18 read the index
(read_index), and 11 of those also check the state of the working copy
(ie_match_stat). Incidentally, the latter include _all_ commands that
update the so-called 'staging area'.

Subversion until recently kept the checked out files scattered in
**/.svn/text-base directories, and many operations were terribly slow
due to this. Subversion 1.7 introduced a new working copy format,
based on a database in the root .svn directory (i.e. an index),
leading to tremendous performance improvements.

The git index is a major feature that facilitates the incredible
performance we're so much addicted to...why be shy about it and call
it something else?


stage: a 'stage' is a special area designated for convenience in
order
for some activity to take place; an orator would prepare a stage in
order for her speak to be successful, otherwise many people might not
be able to hear, or see her. Git porcelain is using the staging area
precisely as a special area to be separated from the working
directory
for convenience.



I'm not a native speaker, but in my limited understanding, 'staging'
in computer jargon is the process of preparing data for a production
system (i.e. until the 'stage' or 'state' of the data is ready for
production). It has nothing to do with the 'stage' in a theater. I've
never heard the term 'staging' beeing used for source code or software
(that would be 'integration'). I've also never heard 'staging' for
moving data back from a production system to some work- or development
area.


Even 'native' sp

Re: pack corruption post-mortem

2013-10-19 Thread Nicolas Pitre
On Sat, 19 Oct 2013, Duy Nguyen wrote:

> On Wed, Oct 16, 2013 at 3:34 PM, Jeff King  wrote:
> > I was recently presented with a repository with a corrupted packfile,
> > and was asked if the data was recoverable. This post-mortem describes
> > the steps I took to investigate and fix the problem. I thought others
> > might find the process interesting, and it might help somebody in the
> > same situation.
> >
> > I started with an fsck, which found a problem with exactly one object
> > (I've used $pack and $obj below to keep the output readable, and also
> > because I'll refer to them later):
> >
> > $ git fsck
> > error: $pack SHA1 checksum mismatch
> > error: index CRC mismatch for object $obj from $pack at offset 51653873
> > error: inflate: data stream error (incorrect data check)
> > error: cannot unpack $obj from $pack at offset 51653873
> 
> I wonder if we should protect the sha-1 and pathname tables in packv4
> with CRC too. A bit flipped in there could cause stream of corrupt
> objects and make it hard to pinpoint the corrupt location..

It turns out that we already have this covered.

The SHA1 used in the name of the pack file is actually the SHA1 checksum 
of the SHA1 table.

The path and ident tables are already protected by the CRC32 in the zlib 
deflated stream.

Normal objects are also zlib deflated (except for their header) but you 
need to inflate them in order to have this CRC verified, which the pack 
data copy tries to avoid.  Hence the separate CRC32 in the index file in 
that case.

However the pack v4 tables are very unlikely to be reused as is from one 
pack to another.


Nicolas
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] send-pack: don't send a thin pack when the server doesn't support it

2013-10-19 Thread Carlos Martín Nieto
On Tue, 2013-10-08 at 15:22 -0700, Jonathan Nieder wrote:
> Duy Nguyen wrote:
> 
> > Or maybe it's not that late. How about you go with your patch and add
> > thin-pack capability to receive-pack too?
> >
> > When new "git push" is used against old server, thin pack is disabled.
> > But that's not a big deal (I hope).
> 
> Could we have separate patches to introduce the server-side capability
> and then to request it in the client?  That way, people with old
> servers can apply the patch introducing the capability if they want.

That could work.

> 
> The new meaning of the "thin-pack" capability should also be
> documented in Documentation/technical/protocol-capabilities.txt.

Oh, right; the capability as described is only about the server being
able to generate a thin pack. Wouldn't his mean that git shouldn't
assume that *any* remote can fix thin packs, though? (other than most
servers you do talk to happen to).

Anyway, facts on the ground and all that. I'll prepare some 

> 
> Done that way and with enough time between the server advertising the
> capability and the client looking for it, it seems like a good idea.


If such patches would be accepted, that would be great. By the time this
all gets merged, we might have thin pack fixing merged into libgit2, but
there will still be uses where fixing them isn't an issue due to other
constraints.


Cheers,
   cmn


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: separate-git-dir doesn't work with mapped drive

2013-10-19 Thread Torsten Bögershausen
On 2013-10-19 10.49, Ain Valtin wrote:
> Hi
> 
> I want to use git in a VirtualBox guest so that the repository is on
> the host drive. So in the VB settings for the guest I set up a shared
> folder "gitRepos" to /home/ain with full access rights. Then in the
> guest OS (Windows XP) I map this shared folder as G drive. Now in the
> project dir I execute
> 
> C:\...\TPP>git init --separate-git-dir g:/TPP
> Initialized empty Git repository in g:/TPP/
> 
> Checked, the repo structure is in the "g:/TPP/" (thus the guest OS can
> write to the mapped dir) and in the .git file created to the project
> dir there is line
> 
> gitdir: g:/TPP
> 
> However when tring to use the repo it fails to recognise the g:/TPP path, ie
> 
> C:\...\TPP>git add .
> fatal: unable to access '../../../../../../g:/TPP/config': Invalid argument
> 
> Also tryed "gitdir: //VBOXSVR/gitRepos/TPP" but this fails too:
> 
> C:\...\TPP>git add .
> fatal: Unable to create 'C:/Documents and
> Settings/Ain/prog/AVT/TPP/../../../../../..///VBOXSVR/gitRepos/TPP/index.lock':
> No such file or directory
> 
> Am I doing something wrong or is it a bug? Any idea how to get it to work?
> 
> BTW the VB is 3.0.14 ie rather old version but it seems that this
> isn't the problem as the git init recognises the mapped drive but
> other commands fail.
> git version is 1.8.4.msysgit.0
> 
> 
> TIA
> ain

(This could go to msysgit mailing list, so I add a CC)
(and I think, it is a known issue)

As a work around, could you try this:
> Downgrading to msysgit 1.8.3 fixes my problem.
and tell us if this helps you?

/Torsten



--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack corruption post-mortem

2013-10-19 Thread Shawn Pearce
On Sat, Oct 19, 2013 at 7:41 AM, Nicolas Pitre  wrote:
> On Sat, 19 Oct 2013, Duy Nguyen wrote:
>
>> On Wed, Oct 16, 2013 at 3:34 PM, Jeff King  wrote:
>> > I was recently presented with a repository with a corrupted packfile,
>> > and was asked if the data was recoverable. This post-mortem describes
>> > the steps I took to investigate and fix the problem. I thought others
>> > might find the process interesting, and it might help somebody in the
>> > same situation.
>> >
>> > I started with an fsck, which found a problem with exactly one object
>> > (I've used $pack and $obj below to keep the output readable, and also
>> > because I'll refer to them later):
>> >
>> > $ git fsck
>> > error: $pack SHA1 checksum mismatch
>> > error: index CRC mismatch for object $obj from $pack at offset 51653873
>> > error: inflate: data stream error (incorrect data check)
>> > error: cannot unpack $obj from $pack at offset 51653873
>>
>> I wonder if we should protect the sha-1 and pathname tables in packv4
>> with CRC too. A bit flipped in there could cause stream of corrupt
>> objects and make it hard to pinpoint the corrupt location..
>
> It turns out that we already have this covered.
>
> The SHA1 used in the name of the pack file is actually the SHA1 checksum
> of the SHA1 table.

I continue to believe this naming is wrong. The pack file name should
be the SHA1 checksum of the pack data stream, but the SHA1 table. This
would allow cleaner update of a repository that was repacked with
different compression settings, but identical objects.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 10/11] read-cache.c: fix memory leaks caused by removed cache entries

2013-10-19 Thread Thomas Rast
Karsten Blees  writes:

> When cache_entry structs are removed from index_state.cache, they are not
> properly freed. Freeing those entries wasn't possible before because we
> couldn't remove them from index_state.name_hash.
>
> Now that we _do_ remove the entries from name_hash, we can also free them.
> Add free(cache_entry) to all call sites of name-hash.c::remove_name_hash in
> read-cache.c, as name-hash.c isn't concerned with cache_entry allocation.
>
> cmd_rm and unmerge_index_entry_at use cache_entry.name after removing the
> entry. Copy the name so that we don't access memory that has been freed.

Is this the version that is currently in pu?  There's a valgrind test
failure in current pu that bisects to 36850edb, which would seem to be
from this email but doesn't have the right author date.  Worse, I cannot
apply this on top of 36850edb^ because I don't have the 'index' blobs
for this patch.  Confusing.

In any case 36850edb currently breaks several valgrind tests.  You can
valgrind only t6022.16 like so (that one test is sufficient to track it
down and it's much faster that way):

  cd t  
  ./t6022-merge-rename.sh --valgrind-only=16

The valgrind error in t6022.16 looks like this:

  ==4959== Invalid read of size 1
  ==4959==at 0x5682A38: vfprintf (vfprintf.c:1629)
  ==4959==by 0x56AC564: vsnprintf (vsnprintf.c:119)
  ==4959==by 0x542005: vreportf (usage.c:12)
  ==4959==by 0x54216C: error_builtin (usage.c:42)
  ==4959==by 0x54261B: error (usage.c:147)
  ==4959==by 0x4FC681: read_index_unmerged (read-cache.c:1900)
  ==4959==by 0x475CF1: reset_index (reset.c:68)
  ==4959==by 0x476A72: cmd_reset (reset.c:346)
  ==4959==by 0x405999: run_builtin (git.c:314)
  ==4959==by 0x405B2C: handle_internal_command (git.c:477)
  ==4959==by 0x405C46: run_argv (git.c:523)
  ==4959==by 0x405DE2: main (git.c:606)
  ==4959==  Address 0x5bedb54 is 84 bytes inside a block of size 104 free'd
  ==4959==at 0x4C2ACDA: free (vg_replace_malloc.c:468)
  ==4959==by 0x4F9360: remove_index_entry_at (read-cache.c:482)
  ==4959==by 0x4FA469: add_index_entry_with_check (read-cache.c:964)
  ==4959==by 0x4FA5A4: add_index_entry (read-cache.c:993)
  ==4959==by 0x4FC663: read_index_unmerged (read-cache.c:1899)
  ==4959==by 0x475CF1: reset_index (reset.c:68)
  ==4959==by 0x476A72: cmd_reset (reset.c:346)
  ==4959==by 0x405999: run_builtin (git.c:314)
  ==4959==by 0x405B2C: handle_internal_command (git.c:477)
  ==4959==by 0x405C46: run_argv (git.c:523)
  ==4959==by 0x405DE2: main (git.c:606)

If you need any more information/help, just ask :-)

-- 
Thomas Rast
t...@thomasrast.ch
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: separate-git-dir doesn't work with mapped drive

2013-10-19 Thread Ain Valtin
>
> (This could go to msysgit mailing list, so I add a CC)
> (and I think, it is a known issue)
>
> As a work around, could you try this:
>> Downgrading to msysgit 1.8.3 fixes my problem.
> and tell us if this helps you?

Yes, 1.8.3 seems to work (did initial commit successfully).

ain
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 1/2] Revert "test-lib: support running tests under valgrind in parallel"

2013-10-19 Thread Thomas Rast
This reverts commit ad0e6233320b004f0d686f6887c803e508607bd2.

--valgrind-parallel was broken from the start: during review I made
the whole valgrind setup code conditional on not being a
--valgrind-parallel worker child.  But even the children crucially
need $GIT_VALGRIND to be set; it should therefore have been set
outside the conditional.

The fix would be a two-liner, but since the introduction of the
feature, almost four months have passed without anyone noticing that
it is broken.  So this feature is not worth the about hundred lines of
test-lib.sh complexity.  Revert it.

Signed-off-by: Thomas Rast 
---
 t/test-lib.sh | 106 --
 1 file changed, 22 insertions(+), 84 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index 0fa7dfd..eaf6759 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -205,15 +205,6 @@ do
--valgrind-only=*)
valgrind_only=$(expr "z$1" : 'z[^=]*=\(.*\)')
shift ;;
-   --valgrind-parallel=*)
-   valgrind_parallel=$(expr "z$1" : 'z[^=]*=\(.*\)')
-   shift ;;
-   --valgrind-only-stride=*)
-   valgrind_only_stride=$(expr "z$1" : 'z[^=]*=\(.*\)')
-   shift ;;
-   --valgrind-only-offset=*)
-   valgrind_only_offset=$(expr "z$1" : 'z[^=]*=\(.*\)')
-   shift ;;
--tee)
shift ;; # was handled already
--root=*)
@@ -227,7 +218,7 @@ do
esac
 done
 
-if test -n "$valgrind_only" || test -n "$valgrind_only_stride"
+if test -n "$valgrind_only"
 then
test -z "$valgrind" && valgrind=memcheck
test -z "$verbose" && verbose_only="$valgrind_only"
@@ -377,9 +368,7 @@ maybe_teardown_verbose () {
 last_verbose=t
 maybe_setup_verbose () {
test -z "$verbose_only" && return
-   if match_pattern_list $test_count $verbose_only ||
-   { test -n "$valgrind_only_stride" &&
-   expr $test_count "%" $valgrind_only_stride - 
$valgrind_only_offset = 0 >/dev/null; }
+   if match_pattern_list $test_count $verbose_only
then
exec 4>&2 3>&1
# Emit a delimiting blank line when going from
@@ -403,7 +392,7 @@ maybe_teardown_valgrind () {
 
 maybe_setup_valgrind () {
test -z "$GIT_VALGRIND" && return
-   if test -z "$valgrind_only" && test -z "$valgrind_only_stride"
+   if test -z "$valgrind_only"
then
GIT_VALGRIND_ENABLED=t
return
@@ -412,10 +401,6 @@ maybe_setup_valgrind () {
if match_pattern_list $test_count $valgrind_only
then
GIT_VALGRIND_ENABLED=t
-   elif test -n "$valgrind_only_stride" &&
-   expr $test_count "%" $valgrind_only_stride - 
$valgrind_only_offset = 0 >/dev/null
-   then
-   GIT_VALGRIND_ENABLED=t
fi
 }
 
@@ -568,9 +553,6 @@ test_done () {
esac
 }
 
-
-# Set up a directory that we can put in PATH which redirects all git
-# calls to 'valgrind git ...'.
 if test -n "$valgrind"
 then
make_symlink () {
@@ -618,42 +600,33 @@ then
make_symlink "$symlink_target" "$GIT_VALGRIND/bin/$base" || exit
}
 
-   # In the case of --valgrind-parallel, we only need to do the
-   # wrapping once, in the main script.  The worker children all
-   # have $valgrind_only_stride set, so we can skip based on that.
-   if test -z "$valgrind_only_stride"
-   then
-   # override all git executables in TEST_DIRECTORY/..
-   GIT_VALGRIND=$TEST_DIRECTORY/valgrind
-   mkdir -p "$GIT_VALGRIND"/bin
-   for file in $GIT_BUILD_DIR/git* $GIT_BUILD_DIR/test-*
-   do
-   make_valgrind_symlink $file
-   done
-   # special-case the mergetools loadables
-   make_symlink "$GIT_BUILD_DIR"/mergetools 
"$GIT_VALGRIND/bin/mergetools"
-   OLDIFS=$IFS
-   IFS=:
-   for path in $PATH
+   # override all git executables in TEST_DIRECTORY/..
+   GIT_VALGRIND=$TEST_DIRECTORY/valgrind
+   mkdir -p "$GIT_VALGRIND"/bin
+   for file in $GIT_BUILD_DIR/git* $GIT_BUILD_DIR/test-*
+   do
+   make_valgrind_symlink $file
+   done
+   # special-case the mergetools loadables
+   make_symlink "$GIT_BUILD_DIR"/mergetools "$GIT_VALGRIND/bin/mergetools"
+   OLDIFS=$IFS
+   IFS=:
+   for path in $PATH
+   do
+   ls "$path"/git-* 2> /dev/null |
+   while read file
do
-   ls "$path"/git-* 2> /dev/null |
-   while read file
-   do
-   make_valgrind_symlink "$file"
-   done
+   make_valgrind_symlink "$file"
done
-   IFS=$OLDIFS
-   fi
+   done
+   IFS=$OLDIFS
PATH=$GI

[PATCH 0/2] Revert --valgrind-parallel test option

2013-10-19 Thread Thomas Rast
These patches remove the --valgrind-parallel=N option that was broken
from the outset (shame on me).  Peff's judgement at the time that its
usefulness would approximately be "meh" turns out to be correct.

What's not in the commit message, but drives part of my reasoning in
doing a revert instead of a fix: I did fix it up locally only to
notice that it was too slow in this case for what I actually wanted to
use it for.  The only valgrind-test workflow that I find bearable is
to run all the tests in the background under prove (takes hours), and
then use the prove output (which says exactly which subtests fail) in
--valgrind-only=.  So the latter is -- again Peff was right
-- the really useful thing.

The only consolation is that I apparently didn't break any other use
of the test suite -- otherwise it would presumably have been fixed
very quickly.

Thomas Rast (2):
  Revert "test-lib: support running tests under valgrind in parallel"
  Revert "test-lib: allow prefixing a custom string before "ok N" etc."

 t/test-lib.sh | 133 +++---
 1 file changed, 34 insertions(+), 99 deletions(-)

-- 
1.8.4.1.810.g312044e

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] Revert "test-lib: allow prefixing a custom string before "ok N" etc."

2013-10-19 Thread Thomas Rast
Now that ad0e623 (test-lib: support running tests under valgrind in
parallel, 2013-06-23) has been reverted, this support code has no
users any more.  Revert it, too.

This reverts commit e939e15d241e942662b9f88f6127ab470ab0a0b9.
---
 t/test-lib.sh | 27 ---
 1 file changed, 12 insertions(+), 15 deletions(-)

diff --git a/t/test-lib.sh b/t/test-lib.sh
index eaf6759..29c1410 100644
--- a/t/test-lib.sh
+++ b/t/test-lib.sh
@@ -210,9 +210,6 @@ do
--root=*)
root=$(expr "z$1" : 'z[^=]*=\(.*\)')
shift ;;
-   --statusprefix=*)
-   statusprefix=$(expr "z$1" : 'z[^=]*=\(.*\)')
-   shift ;;
*)
echo "error: unknown test option '$1'" >&2; exit 1 ;;
esac
@@ -320,12 +317,12 @@ trap 'die' EXIT
 
 test_ok_ () {
test_success=$(($test_success + 1))
-   say_color "" "${statusprefix}ok $test_count - $@"
+   say_color "" "ok $test_count - $@"
 }
 
 test_failure_ () {
test_failure=$(($test_failure + 1))
-   say_color error "${statusprefix}not ok $test_count - $1"
+   say_color error "not ok $test_count - $1"
shift
echo "$@" | sed -e 's/^/#   /'
test "$immediate" = "" || { GIT_EXIT_OK=t; exit 1; }
@@ -333,12 +330,12 @@ test_failure_ () {
 
 test_known_broken_ok_ () {
test_fixed=$(($test_fixed+1))
-   say_color error "${statusprefix}ok $test_count - $@ # TODO known 
breakage vanished"
+   say_color error "ok $test_count - $@ # TODO known breakage vanished"
 }
 
 test_known_broken_failure_ () {
test_broken=$(($test_broken+1))
-   say_color warn "${statusprefix}not ok $test_count - $@ # TODO known 
breakage"
+   say_color warn "not ok $test_count - $@ # TODO known breakage"
 }
 
 test_debug () {
@@ -462,8 +459,8 @@ test_skip () {
of_prereq=" of $test_prereq"
fi
 
-   say_color skip >&3 "${statusprefix}skipping test: $@"
-   say_color skip "${statusprefix}ok $test_count # skip $1 
(missing $missing_prereq${of_prereq})"
+   say_color skip >&3 "skipping test: $@"
+   say_color skip "ok $test_count # skip $1 (missing 
$missing_prereq${of_prereq})"
: true
;;
*)
@@ -501,11 +498,11 @@ test_done () {
 
if test "$test_fixed" != 0
then
-   say_color error "${statusprefix}# $test_fixed known breakage(s) 
vanished; please update test(s)"
+   say_color error "# $test_fixed known breakage(s) vanished; 
please update test(s)"
fi
if test "$test_broken" != 0
then
-   say_color warn "${statusprefix}# still have $test_broken known 
breakage(s)"
+   say_color warn "# still have $test_broken known breakage(s)"
fi
if test "$test_broken" != 0 || test "$test_fixed" != 0
then
@@ -528,9 +525,9 @@ test_done () {
then
if test $test_remaining -gt 0
then
-   say_color pass "${statusprefix}# passed all 
$msg"
+   say_color pass "# passed all $msg"
fi
-   say "${statusprefix}1..$test_count$skip_all"
+   say "1..$test_count$skip_all"
fi
 
test -d "$remove_trash" &&
@@ -544,8 +541,8 @@ test_done () {
*)
if test $test_external_has_tap -eq 0
then
-   say_color error "${statusprefix}# failed $test_failure 
among $msg"
-   say "${statusprefix}1..$test_count"
+   say_color error "# failed $test_failure among $msg"
+   say "1..$test_count"
fi
 
exit 1 ;;
-- 
1.8.4.1.810.g312044e

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: GSoC 2014: Summary so far, discussion starter: how to improve?

2013-10-19 Thread Fredrik Gustafsson
Hi,
so I was a GSoC:er, I got some (most) of my code merged but didn't fully
met my (personal) goals for the project. However I do passed in the eyes
of Google.

GSoC is _hard_. You end up feeling completely stupid over and over
again. Git has hard standards. Beeing just a single programmer and/or
just learnt programming in school, there's a lot of difference.

I started with learning git (better), read documentation and looking
at the codebase and still felt lost.

After that I'd to learn communication skills, who to mail, when to mail,
how to write a commit message, been real strict with codestyle,
setting up a github account, configuring git in a "git contributor
friendly way", etc.

On Sat, Oct 19, 2013 at 08:09:33AM +0200, Thomas Rast wrote:
> Theories
> 
> 
> These are the hypotheses that I have heard (mostly in [1] and [2]) as
> to what is bad about Git's prior GSoC participations.
> 
> * Aiming far too high, focusing on cool/shiny projects with a large
>   impact.  This also affects the students, who tend to cluster around
>   the largest, shiniest project suggestions.
> 
> * Diminishing returns: Git is too mature, with little low-hanging
>   fruit left, making such projects harder
> 
> * Projects are too political, progress depending on non-technical
>   arguments
> 
> * Our mentors suck on various axes, notably not supporting students
>   enough in things that matter:
>   - smooth interaction with community
>   - ensure fast iteration/short cycles
>   - navigating the code base
> 
> * Scope creep: projects tend to get blocked on some bigger
>   refactoring/restructuring task that was not in the original proposal
> 
> 
> 
> * View GSoC much more as a lot of work than free labor

Totally agree, GSoC is an investment for future labor, not labor.

> 
> * Break projects into smaller, easier tasks
>   - They should individually be simple, quick things if the mentor did
> them.
>   - Should be parallelizable so students don't have to block on reviews.

I'd 5-6 smaller projects setup for the summer, I think I managed to do
2-3 of them. (I did however do everything I applied for). I really think
it's an excellent idea. This also meant that while one patch waited for
review, I'd other things to work on.

> 
> * Mentoring improvements:
>   - Always have a co-mentor
>   - Focus on social aspects (who to Cc, etc.)
>   - Nominate separate "review mentors" to ensure fast review cycles

I like the idea of review mentors. However bear in mind that you'll
already have three people reviewing the patches (two mentors and Junio).
We will not make it look like it's impossible to get things into
git.git.

> * Have students review some patches

This would be excellent. That's a part that I thinks is very usefull and
would easy students remaining with git. It's easier to review patches
than to make them.


As a last part I would say that GSoC learned me a lot. I'm good at git,
I know test driven development, I learned shell, I got to play with a
huge C-codebase for the first time and I learned open source projects,
QA, etc.

I would like to thank Jens and Heiko for good mentoring and a lot of
patience!

(as a sidenote, I did get extremly busy when the school started. I
didn't even had time to fix a serious bug in my code (Jens had to clean
up after me). However two years later I'd some time again and got a few
patches in and I hope to get a few patches into git in the future too).

A successful GSoC for git isn't a merged project but continued
contribution to git (not necessairly in patches, but also in support and
review).

A successful GSoC for Google/student is a merged project.

A successful GSoC for student is a great learning experience.
-- 
Med vänliga hälsningar
Fredrik Gustafsson

tel: 0733-608274
e-post: iv...@iveqy.com

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v8] diff.c: keep arrow(=>) on show_stats()'s shortened filename part to make rename visible

2013-10-19 Thread Yoshioka Tsuneo
Hello Thomas

> Can you briefly describe what you changed in v7 and v8, both compared to
> earlier versions and between v7 and v8?
On v7, 's basename part is tried to kept. On v7, whole  part is tried 
to kept.
For example, in case below:
   parent_path{sourceDirectory => 
DestinationDirectory}path1/path2//longlongFilename.txt
 On v7, this can be like:
   …{...ceDirectory => …onDirectory}.../longlongFilename.txt
On v8, it will be like:
   …{...irectory => …irectory}path1/path2/longlongFilename.txt


This change is based on the review from Junio below.
(I myself is not sure what is the better way.)

On Oct 17, 2013, at 10:29 PM, Junio C Hamano  wrote:
> I am not sure if distributing the burden of truncation equally to
> three parts so that the resulting pieces are of similar lengths is
> really a good idea.  Between these two
> 
>   {...SourceDirectory => ...nationDirectory}...ileThatWasMoved 
>   {...ceDirectory => ...ionDirectory}nameOfTheFileThatWasMoved
> 
> that attempt to show that the file nameOfTheFileThatWasMoved was
> moved from the longSourceDirectory to the DestinationDirectory, the
> latter is much more informative, I would think.


On Oct 18, 2013, at 1:38 AM, Junio C Hamano  wrote:
> Yoshioka Tsuneo  writes:
> 
>> In the "[PATCH v7]", I changed to keep filename part of suffix to handle
>> above case, but not always keep directory part because I feel totally
>> keeping all part of long suffix including directory name may cause output 
>> like:
>>…{… => …}…ongPath1/LongPath2/nameOfTheFileThatWasMoved 
>> And, above may be worse than:
>>   ...{...ceDirectory => …ionDirectory}.../nameOfTheFileThatWasMoved
>> I think.
> 
> I am not sure if I agree.
> 
> Losing LongPath2 part may be more significant data loss than losing
> a single bit that says the change is a rename, as the latter may not
> quite tell us what these two directories were anyway.


Also, I guess Junio might be suspicious to the idea to keep arrow("=>") itself, 
maybe ?
=
(From What's cooking in git.git (Oct 2013, #04; Fri, 18))
- diff.c: keep arrow(=>) on show_stats()'s shortened filename part to make 
rename visible

Attempts to give more weight on the fact that a filepair represents
a rename than showing substring of the actual path when diffstat
lines are not wide enough.

I am not sure if that is solving a right problem, though.
=

Thanks!

---
Tsuneo Yoshioka (吉岡 恒夫)
yoshiokatsu...@gmail.com




On Oct 19, 2013, at 9:24 AM, Thomas Rast  wrote:

> Yoshioka Tsuneo  writes:
> 
>> "git diff -M --stat" can detect rename and show renamed file name like
>> "foofoofoo => barbarbar".
>> 
>> Before this commit, this output is shortened always by omitting left most
>> part like "...foo => barbarbar". So, if the destination filename is too long,
>> source filename putting left or arrow can be totally omitted like
>> "...barbarbar", without including any of "foofoofoo =>".
>> In such a case where arrow symbol is omitted, there is no way to know
>> whether the file is renamed or existed in the original.
>> 
>> Make sure there is always an arrow, like "...foo => ...bar".
>> 
>> The output can contain curly braces('{','}') for grouping.
>> So, in general, the output format is "{ => }"
>> 
>> To keep arrow("=>"), try to omit  as long as possible at first
>> because later part or changing part will be the more important part.
>> If it is not enough, shorten ,  trying to have the same
>> maximum length.
>> If it is not enough yet, omit .
>> 
>> Signed-off-by: Tsuneo Yoshioka 
>> Test-added-by: Thomas Rast 
>> ---
> 
> Can you briefly describe what you changed in v7 and v8, both compared to
> earlier versions and between v7 and v8?
> 
> It would be very nice if you could always include such a "patch
> changelog" after the "---" above.  git-am will ignore the text between
> "---" and the diff, so you can write comments for the reviewers there
> without creating noise in the commit message.
> 
> Also, please keep reviewers in the Cc list for future discussion/patches
> so that they will see them.
> 
> -- 
> Thomas Rast
> t...@thomasrast.ch

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: pack corruption post-mortem

2013-10-19 Thread Duy Nguyen
On Sat, Oct 19, 2013 at 9:41 PM, Nicolas Pitre  wrote:
> On Sat, 19 Oct 2013, Duy Nguyen wrote:
> The SHA1 used in the name of the pack file is actually the SHA1 checksum
> of the SHA1 table.
>
> The path and ident tables are already protected by the CRC32 in the zlib
> deflated stream.
>
> Normal objects are also zlib deflated (except for their header) but you
> need to inflate them in order to have this CRC verified, which the pack
> data copy tries to avoid.  Hence the separate CRC32 in the index file in
> that case.

OK slight change in the subject, what about reading code (i.e.
sha1_file.c)? With v2 crc32 is verified by object inflate code. With
v4 trees or commits, because we store some (or all) data outside of
the deflated stream, we will not benefit from crc32 verifcation
previously done for all trees and commits. Should we perform explict
crc32 check when reading v4 trees and commits (and maybe verify the
sha-1 table too)?
-- 
Duy
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [msysGit] [PATCH] Prevent buffer overflows when path is too big

2013-10-19 Thread Torsten Bögershausen
(may be s/path is too big/path is too long/ ?)

On 19.10.13 12:52, Antoine Pelisse wrote:
> Currently, most buffers created with PATH_MAX length, are not checked
> when being written, and can overflow if PATH_MAX is not big enough to
> hold the path.
> 
> Fix that by using strlcpy() where strcpy() was used, and also run some
> extra checks when copy is done with memcpy().
> 
> Reported-by: Wataru Noguchi 
> Signed-off-by: Antoine Pelisse 
> ---
>  abspath.c| 10 +++---
>  diffcore-order.c |  2 +-
>  entry.c  | 14 ++
>  unpack-trees.c   |  2 ++
>  4 files changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/abspath.c b/abspath.c
> index 64adbe2..0e60ba4 100644
> --- a/abspath.c
> +++ b/abspath.c
> @@ -216,11 +216,15 @@ const char *absolute_path(const char *path)
>  const char *prefix_filename(const char *pfx, int pfx_len, const char *arg)
>  {
>   static char path[PATH_MAX];
> +
> + if (pfx_len > PATH_MAX)
I think this should be 
if (pfx_len > PATH_MAX-1) /* Keep 1 char for '\0'
> + die("Too long prefix path: %s", pfx);
> +
>  #ifndef GIT_WINDOWS_NATIVE
>   if (!pfx_len || is_absolute_path(arg))
>   return arg;
>   memcpy(path, pfx, pfx_len);
> - strcpy(path + pfx_len, arg);
> + strlcpy(path + pfx_len, arg, PATH_MAX - pfx_len);

I'm not sure how to handle overlong path in general, there are several ways:
a) Silently overwrite memory (with help of memcpy() and/or strcpy()
b) Silently shorten the path using strlcpy() instead of strcpy()
c) Avoid the overwriting and call die().
d) Prepare a longer buffer using xmalloc()

Today we do a), this is not a good thing and the worst choice.


A little side note:
  It would be good to have test cases for either b), c) or d).

  As PATH_MAX is OS dependend, we need both a main program written in c
  and a test case written in t/t.sh.
  Some existing code can be used for inspiration, e.g. 
  test-wildmatch.c in combination with t/t3070-wildmatch.sh
  This willl allow us to reproduce the error, and define how git should behave.

End of the side note, let's look closer at the suggested patch, implementing b)

Silently shortening an overlong path like
"/foo/bar/baz" could result something like

"/foo/bar/ba" /* That filename may be part of the repo too */
or
"/foo/bar/" /*  This is a directory, not a file name */

In either case the end user has no idea why git choose another file name.
And this could be hard to debug.
After a couple of hours she/he may send a message asking for help to the 
mailing list,
and we end up in more people doing debugging.

c) Is much easier to debug:
  Git can not handle this situation, and we print out the parameters in die()

I would prefer c) over b), make clear that git can't handle that situation.

d) Would mean some more re-factoring: Check all callers to prefix_filename().
Some of them call xstrdup() after prefix_filename(), which mean that we could
change prefix_filename() to always return new string which is long enough via 
xmalloc(),
and not a static buffer.

So we come to the next point (and this is my personal experience,
so please don't get me wrong):
how much time can you spend on this?

If the answer is kind of "very little", I would go for c)
  Avoid the silent memory corruption, and say to the user "we can not handle 
this"

If the answer is kind of "little", I would go for c) and a test program,
  covering all the different code path in abspath()
  (WHich may deserve a refactoring as well, since the code for 
GIT_WINDOWS_NATIVE
  is very similar to the non-GIT_WINDOWS_NATIVE)

If the answer is kind of "more than little", a different strategie may be 
better:
  Start sending a patch for c)
  I think we have enough volunteers here for a review, so we can life without 
the test code.
  On top of that, some volunteer can develop d).
  
So far I have only looked at abspath(), and your patch touches more places.
I think more and more that calling die()
with all information included why we call die() is a good starting point.

It will allow the users to see what is going on.
May be the repo can be re-arranged to use shorter path names than what we can 
handle.
[snip]
 
/Torsten

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [msysGit] [PATCH] Prevent buffer overflows when path is too big

2013-10-19 Thread Ondřej Bílka
On Sun, Oct 20, 2013 at 07:47:06AM +0200, Torsten Bögershausen wrote:
> (may be s/path is too big/path is too long/ ?)
> 
> On 19.10.13 12:52, Antoine Pelisse wrote:
> > Currently, most buffers created with PATH_MAX length, are not checked
> > when being written, and can overflow if PATH_MAX is not big enough to
> > hold the path.
> > 
> > Fix that by using strlcpy() where strcpy() was used, and also run some
> > extra checks when copy is done with memcpy().
> > 
> > Reported-by: Wataru Noguchi 
> > Signed-off-by: Antoine Pelisse 
> > ---
> > diff --git a/abspath.c b/abspath.c
> > index 64adbe2..0e60ba4 100644
> > --- a/abspath.c
> > +++ b/abspath.c
> > @@ -216,11 +216,15 @@ const char *absolute_path(const char *path)
> >  const char *prefix_filename(const char *pfx, int pfx_len, const char *arg)
> >  {
> > static char path[PATH_MAX];

Why do you need static there?
> > +
> > +   if (pfx_len > PATH_MAX)
> I think this should be 
> if (pfx_len > PATH_MAX-1) /* Keep 1 char for '\0'
> > +   die("Too long prefix path: %s", pfx);
> > +
> >  #ifndef GIT_WINDOWS_NATIVE
> > if (!pfx_len || is_absolute_path(arg))
> > return arg;
> > memcpy(path, pfx, pfx_len);
> > -   strcpy(path + pfx_len, arg);
> > +   strlcpy(path + pfx_len, arg, PATH_MAX - pfx_len);
> 
> I'm not sure how to handle overlong path in general, there are several ways:
> a) Silently overwrite memory (with help of memcpy() and/or strcpy()
> b) Silently shorten the path using strlcpy() instead of strcpy()
> c) Avoid the overwriting and call die().
> d) Prepare a longer buffer using xmalloc()
> 
There is also
e) modify allocation to place write protected page after buffer end.
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [msysGit] [PATCH] Prevent buffer overflows when path is too big

2013-10-19 Thread Torsten Bögershausen
On 20.10.13 08:05, Ondřej Bílka wrote:
> On Sun, Oct 20, 2013 at 07:47:06AM +0200, Torsten Bögershausen wrote:
>> (may be s/path is too big/path is too long/ ?)
>>
>> On 19.10.13 12:52, Antoine Pelisse wrote:
>>> Currently, most buffers created with PATH_MAX length, are not checked
>>> when being written, and can overflow if PATH_MAX is not big enough to
>>> hold the path.
>>>
>>> Fix that by using strlcpy() where strcpy() was used, and also run some
>>> extra checks when copy is done with memcpy().
>>>
>>> Reported-by: Wataru Noguchi 
>>> Signed-off-by: Antoine Pelisse 
>>> ---
>>> diff --git a/abspath.c b/abspath.c
>>> index 64adbe2..0e60ba4 100644
>>> --- a/abspath.c
>>> +++ b/abspath.c
>>> @@ -216,11 +216,15 @@ const char *absolute_path(const char *path)
>>>  const char *prefix_filename(const char *pfx, int pfx_len, const char *arg)
>>>  {
>>> static char path[PATH_MAX];
> 
> Why do you need static there?
Good point.
get_pathname() from path.c may be better.

>>> +
>>> +   if (pfx_len > PATH_MAX)
>> I think this should be 
>> if (pfx_len > PATH_MAX-1) /* Keep 1 char for '\0'
>>> +   die("Too long prefix path: %s", pfx);
>>> +
>>>  #ifndef GIT_WINDOWS_NATIVE
>>> if (!pfx_len || is_absolute_path(arg))
>>> return arg;
>>> memcpy(path, pfx, pfx_len);
>>> -   strcpy(path + pfx_len, arg);
>>> +   strlcpy(path + pfx_len, arg, PATH_MAX - pfx_len);
>>
>> I'm not sure how to handle overlong path in general, there are several ways:
>> a) Silently overwrite memory (with help of memcpy() and/or strcpy()
>> b) Silently shorten the path using strlcpy() instead of strcpy()
>> c) Avoid the overwriting and call die().
>> d) Prepare a longer buffer using xmalloc()
>>
> There is also
> e) modify allocation to place write protected page after buffer end.

Yes, I think this is what electric fence, DUMA or valgrind do:

http://sourceforge.jp/projects/freshmeat_efence/
http://duma.sourceforge.net/
http://valgrind.sourceforge.net/

Theses are very good tools for developers, finding memory corruption
(or other bugs like using uninitialized memory).

One of the motivation I asked for test cases is that a git developer can
run these test cases under valgrind and can verify that we are never out of 
range.

For an end user a git "crash" caused by trying to write to a write protected 
page
is better than silently corrupting memory.

And a range check, followed by die(), is even easier to debug.
For an end user.
/Torsten

















--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html