Re: What's cooking in git.git (Oct 2013, #04; Fri, 18)
Jeff King wrote: > On Fri, Oct 18, 2013 at 03:14:49PM -0700, Junio C Hamano wrote: > > > * jn/add-2.0-u-A-sans-pathspec (2013-04-26) 1 commit > > - git add: -u/-A now affects the entire working tree > > > > Will cook in 'next' until Git 2.0. > > > > > > * jc/core-checkstat-2.0 (2013-05-06) 1 commit > > - core.statinfo: remove as promised in Git 2.0 > > > > Will cook in 'next' until Git 2.0. > > > > > > * jc/push-2.0-default-to-simple (2013-06-18) 1 commit > > - push: switch default from "matching" to "simple" > > > > Will cook in 'next' until Git 2.0. > > > > > > * jc/add-2.0-ignore-removal (2013-04-22) 1 commit > > - git add ... defaults to "-A" > > > > Updated endgame for "git add " that defaults to "--all" > > aka "--no-ignore-removal". > > > > Will cook in 'next' until Git 2.0. > > I notice that these are not actually in 'next', despite the > descriptions. Should they be, to give them wider exposure? I say they shouldn't be, unless the next version after 1.8 is 2.0. There should be a separate branch for 2.0. -- Felipe Contreras -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
separate-git-dir doesn't work with mapped drive
Hi I want to use git in a VirtualBox guest so that the repository is on the host drive. So in the VB settings for the guest I set up a shared folder "gitRepos" to /home/ain with full access rights. Then in the guest OS (Windows XP) I map this shared folder as G drive. Now in the project dir I execute C:\...\TPP>git init --separate-git-dir g:/TPP Initialized empty Git repository in g:/TPP/ Checked, the repo structure is in the "g:/TPP/" (thus the guest OS can write to the mapped dir) and in the .git file created to the project dir there is line gitdir: g:/TPP However when tring to use the repo it fails to recognise the g:/TPP path, ie C:\...\TPP>git add . fatal: unable to access '../../../../../../g:/TPP/config': Invalid argument Also tryed "gitdir: //VBOXSVR/gitRepos/TPP" but this fails too: C:\...\TPP>git add . fatal: Unable to create 'C:/Documents and Settings/Ain/prog/AVT/TPP/../../../../../..///VBOXSVR/gitRepos/TPP/index.lock': No such file or directory Am I doing something wrong or is it a bug? Any idea how to get it to work? BTW the VB is 3.0.14 ie rather old version but it seems that this isn't the problem as the git init recognises the mapped drive but other commands fail. git version is 1.8.4.msysgit.0 TIA ain -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: separate-git-dir doesn't work with mapped drive
There seems to be some regression fixes regarding dos drives. The one that caught my eyes is 7fbd422 (relative_path should honor dos-drive-prefix - 2013-10-14) but it's not released yet. And I'm not sure if msys branch picks it up yet even if you want to rebuild and test it yourself. Copying Jiang Xin, maybe he can tell if that commit should fix what you describe here, or it's a new bug. -- Duy On Sat, Oct 19, 2013 at 3:49 PM, Ain Valtin wrote: > Hi > > I want to use git in a VirtualBox guest so that the repository is on > the host drive. So in the VB settings for the guest I set up a shared > folder "gitRepos" to /home/ain with full access rights. Then in the > guest OS (Windows XP) I map this shared folder as G drive. Now in the > project dir I execute > > C:\...\TPP>git init --separate-git-dir g:/TPP > Initialized empty Git repository in g:/TPP/ > > Checked, the repo structure is in the "g:/TPP/" (thus the guest OS can > write to the mapped dir) and in the .git file created to the project > dir there is line > > gitdir: g:/TPP > > However when tring to use the repo it fails to recognise the g:/TPP path, ie > > C:\...\TPP>git add . > fatal: unable to access '../../../../../../g:/TPP/config': Invalid argument > > Also tryed "gitdir: //VBOXSVR/gitRepos/TPP" but this fails too: > > C:\...\TPP>git add . > fatal: Unable to create 'C:/Documents and > Settings/Ain/prog/AVT/TPP/../../../../../..///VBOXSVR/gitRepos/TPP/index.lock': > No such file or directory > > Am I doing something wrong or is it a bug? Any idea how to get it to work? > > BTW the VB is 3.0.14 ie rather old version but it seems that this > isn't the problem as the git init recognises the mapped drive but > other commands fail. > git version is 1.8.4.msysgit.0 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: pack corruption post-mortem
On Wed, Oct 16, 2013 at 3:34 PM, Jeff King wrote: > I was recently presented with a repository with a corrupted packfile, > and was asked if the data was recoverable. This post-mortem describes > the steps I took to investigate and fix the problem. I thought others > might find the process interesting, and it might help somebody in the > same situation. > > I started with an fsck, which found a problem with exactly one object > (I've used $pack and $obj below to keep the output readable, and also > because I'll refer to them later): > > $ git fsck > error: $pack SHA1 checksum mismatch > error: index CRC mismatch for object $obj from $pack at offset 51653873 > error: inflate: data stream error (incorrect data check) > error: cannot unpack $obj from $pack at offset 51653873 I wonder if we should protect the sha-1 and pathname tables in packv4 with CRC too. A bit flipped in there could cause stream of corrupt objects and make it hard to pinpoint the corrupt location.. -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: GSoC 2014: Summary so far, discussion starter: how to improve?
Hi, On Sat, Oct 19, 2013 at 8:09 AM, Thomas Rast wrote: > > Previous Episodes > = > > Git participated in Google Summer of Code (GSoC) 2007-2012, but did not > participate in 2013 based on discussion in February [1]. At Git-Merge > in Berlin there was a discussion round [2] that this post attempts to > summarize as a basis for further discussion and (hopefully) > participation in GSoC'14. > > Much sooner than in previous years, Google has already confirmed that > GSoC'14 will in fact happen [3]. > > Note that while mistakes herein are mine, many ideas and opinions > aren't. This really tries to be a summary. Please flame >/dev/null, > not me. Thank you for sending this very good summary. > Theories > > > These are the hypotheses that I have heard (mostly in [1] and [2]) as > to what is bad about Git's prior GSoC participations. > > * Aiming far too high, focusing on cool/shiny projects with a large > impact. This also affects the students, who tend to cluster around > the largest, shiniest project suggestions. Yeah, focusing on _too big_ projects. > * Diminishing returns: Git is too mature, with little low-hanging > fruit left, making such projects harder > > * Projects are too political, progress depending on non-technical > arguments > > * Our mentors suck on various axes, notably not supporting students > enough in things that matter: > - smooth interaction with community > - ensure fast iteration/short cycles > - navigating the code base Yeah, "fast iteration/short cycles" means submitting the work early and often _to the mailing list_. One of the thing we learned too is that focusing on selecting the "best" students didn't really worked. What worked was selecting students who have already contributed patches, but unfortunately there are not many potential students who have already contributed. > Further steps > = > > * Discuss :-) > > And then apply this hard-won knowledge systematically. In > particular I think it wouldn't hurt to keep the project proposals > out of this thread until we have agreed on goals/standards to > measure them against. Based on the above, what about the following: 1) Advertise early and widely that we will very strongly favor students who have already contributed and that we can help them contribute long before their application process starts. Maybe we could even have a pre GSoC application process for potential students. 2) Advertise that we will really favor small projects over big/shiny ones. 3) Come up with a list of criteria like the following to measure student/application: - has the student already contributed much: 0-30 points: 0 means no contribution to any open source project, 5 means some contribution to another open source project, 20 means long time Git contributor - what is the size of the project: 0-10 points: 0 means big project (one month or more for a regular contributor) 10 means small project (one week for a regular contributor) - does the student appear willing to act on advice: 0-5 points - does the student appear to have the necessary skills: 0-5 points - does the proposal look good: 0-5 points > * Gather a list of potential mentors. I would be happy to mentor next year. Thanks, Christian. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH] Prevent buffer overflows when path is too big
Currently, most buffers created with PATH_MAX length, are not checked when being written, and can overflow if PATH_MAX is not big enough to hold the path. Fix that by using strlcpy() where strcpy() was used, and also run some extra checks when copy is done with memcpy(). Reported-by: Wataru Noguchi Signed-off-by: Antoine Pelisse --- abspath.c| 10 +++--- diffcore-order.c | 2 +- entry.c | 14 ++ unpack-trees.c | 2 ++ 4 files changed, 20 insertions(+), 8 deletions(-) diff --git a/abspath.c b/abspath.c index 64adbe2..0e60ba4 100644 --- a/abspath.c +++ b/abspath.c @@ -216,11 +216,15 @@ const char *absolute_path(const char *path) const char *prefix_filename(const char *pfx, int pfx_len, const char *arg) { static char path[PATH_MAX]; + + if (pfx_len > PATH_MAX) + die("Too long prefix path: %s", pfx); + #ifndef GIT_WINDOWS_NATIVE if (!pfx_len || is_absolute_path(arg)) return arg; memcpy(path, pfx, pfx_len); - strcpy(path + pfx_len, arg); + strlcpy(path + pfx_len, arg, PATH_MAX - pfx_len); #else char *p; /* don't add prefix to absolute paths, but still replace '\' by '/' */ @@ -228,8 +232,8 @@ const char *prefix_filename(const char *pfx, int pfx_len, const char *arg) pfx_len = 0; else if (pfx_len) memcpy(path, pfx, pfx_len); - strcpy(path + pfx_len, arg); - for (p = path + pfx_len; *p; p++) + strlcpy(path + pfx_len, arg, PATH_MAX - pfx_len); + for (p = path + pfx_len; p < path + PATH_MAX && *p; p++) if (*p == '\\') *p = '/'; #endif diff --git a/diffcore-order.c b/diffcore-order.c index 23e9385..f083c82 100644 --- a/diffcore-order.c +++ b/diffcore-order.c @@ -76,7 +76,7 @@ static int match_order(const char *path) char p[PATH_MAX]; for (i = 0; i < order_cnt; i++) { - strcpy(p, path); + strlcpy(p, path, PATH_MAX); while (p[0]) { char *cp; if (!fnmatch(order[i], p, 0)) diff --git a/entry.c b/entry.c index acc892f..39bee42 100644 --- a/entry.c +++ b/entry.c @@ -50,17 +50,20 @@ static void remove_subtree(const char *path) struct dirent *de; char pathbuf[PATH_MAX]; char *name; + size_t pathlen; if (!dir) die_errno("cannot opendir '%s'", path); - strcpy(pathbuf, path); - name = pathbuf + strlen(path); + strlcpy(pathbuf, path, PATH_MAX); + pathlen = strlen(path); + name = pathbuf + pathlen; *name++ = '/'; + pathlen++; while ((de = readdir(dir)) != NULL) { struct stat st; if (is_dot_or_dotdot(de->d_name)) continue; - strcpy(name, de->d_name); + strlcpy(name, de->d_name, PATH_MAX - pathlen); if (lstat(pathbuf, &st)) die_errno("cannot lstat '%s'", pathbuf); if (S_ISDIR(st.st_mode)) @@ -244,8 +247,11 @@ int checkout_entry(struct cache_entry *ce, if (topath) return write_entry(ce, topath, state, 1); + if (len > PATH_MAX + 1) + die("Too long path: %s", state->base_dir); + memcpy(path, state->base_dir, len); - strcpy(path + len, ce->name); + strlcpy(path + len, ce->name, PATH_MAX + 1 - len); len += ce_namelen(ce); if (!check_path(path, len, &st, state->base_dir_len)) { diff --git a/unpack-trees.c b/unpack-trees.c index 1a61e6f..85473b1 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -918,6 +918,8 @@ static int clear_ce_flags_1(struct cache_entry **cache, int nr, int processed; len = slash - name; + if (len + prefix_len >= PATH_MAX) + len = PATH_MAX - prefix_len - 1; memcpy(prefix + prefix_len, name, len); /* -- 1.8.4.1.507.g9768648.dirty -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: separate-git-dir doesn't work with mapped drive
On Sat, Oct 19, 2013 at 11:49:27AM +0300, Ain Valtin wrote: > Hi > > I want to use git in a VirtualBox guest so that the repository is on > the host drive. So in the VB settings for the guest I set up a shared > folder "gitRepos" to /home/ain with full access rights. Then in the > guest OS (Windows XP) I map this shared folder as G drive. Now in the > project dir I execute > > C:\...\TPP>git init --separate-git-dir g:/TPP > Initialized empty Git repository in g:/TPP/ > > Checked, the repo structure is in the "g:/TPP/" (thus the guest OS can > write to the mapped dir) and in the .git file created to the project > dir there is line > > gitdir: g:/TPP > > However when tring to use the repo it fails to recognise the g:/TPP path, ie > > C:\...\TPP>git add . > fatal: unable to access '../../../../../../g:/TPP/config': Invalid argument > > Also tryed "gitdir: //VBOXSVR/gitRepos/TPP" but this fails too: > > C:\...\TPP>git add . > fatal: Unable to create 'C:/Documents and > Settings/Ain/prog/AVT/TPP/../../../../../..///VBOXSVR/gitRepos/TPP/index.lock': > No such file or directory > > Am I doing something wrong or is it a bug? Any idea how to get it to work? > > BTW the VB is 3.0.14 ie rather old version but it seems that this > isn't the problem as the git init recognises the mapped drive but > other commands fail. > git version is 1.8.4.msysgit.0 > > > TIA > ain > -- > To unsubscribe from this list: send the line "unsubscribe git" in > the body of a message to majord...@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html Git has a bad history with mapped drives in windows. It's also usually a bad idea to use git over the network (and most mapped drives are over the network and not local between virt. machines). I would advise not to use this setup since for the past two years that git has sometime worked and sometimes not with this setup. (It's not just seperate git dir, a git dir at all over a smb share have been problematic). (This is probably something we should have in a test-suite somewhere.) -- Med vänliga hälsningar Fredrik Gustafsson tel: 0733-608274 e-post: iv...@iveqy.com -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: separate-git-dir doesn't work with mapped drive
On Sat, Oct 19, 2013 at 2:11 PM, Fredrik Gustafsson wrote: > > Git has a bad history with mapped drives in windows. It's also usually a > bad idea to use git over the network (and most mapped drives are over > the network and not local between virt. machines). > > I would advise not to use this setup since for the past two years that > git has sometime worked and sometimes not with this setup. (It's not > just seperate git dir, a git dir at all over a smb share have been > problematic). That's a shame :( As I wrote I want to use git in a virtual machine and as a extra precaution it would be nice to have the repo outside of the VM, on the host drive - should the VM not to start up for whatever reason I wouldn't lose my repo with it... ain -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: separate-git-dir doesn't work with mapped drive
On Sat, Oct 19, 2013 at 02:45:46PM +0300, Ain Valtin wrote: > On Sat, Oct 19, 2013 at 2:11 PM, Fredrik Gustafsson wrote: > > > > Git has a bad history with mapped drives in windows. It's also usually a > > bad idea to use git over the network (and most mapped drives are over > > the network and not local between virt. machines). > > > > I would advise not to use this setup since for the past two years that > > git has sometime worked and sometimes not with this setup. (It's not > > just seperate git dir, a git dir at all over a smb share have been > > problematic). > > That's a shame :( > As I wrote I want to use git in a virtual machine and as a extra > precaution it would be nice to have the repo outside of the VM, on the > host drive - should the VM not to start up for whatever reason I > wouldn't lose my repo with it... > > > ain A better way (if you can afford it) is to have a repo on the virtual machine and push to a repo on your hostmachine (so that you've two repos). However your solution "should" work but I personally have had some problems with that area. -- Med vänliga hälsningar Fredrik Gustafsson tel: 0733-608274 e-post: iv...@iveqy.com -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
RE: GSoC 2014: Summary so far, discussion starter: how to improve?
Thomas Rast wrote: > * Diminishing returns: Git is too mature, with little low-hanging > fruit left, making such projects harder Too mature? Aren't there other projects that are "too mature" as well? In particular I'm thinking about the Linux kernel. I think it's not about maturity, but how they deal with change. > * Does libgit2 want to remain under the Git umbrella, or participate > on its own? Personally I don't see what libgit2 has to do with git.git. There isn't even communication between the two projects. -- Felipe Contreras -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v2 00/14] Officially start moving to the term 'staging area'
From: "Karsten Blees" Am 15.10.2013 00:29, schrieb Felipe Contreras: tl;dr: everyone except Junio C Hamano and Drew Northup agrees; we should move away from the name "the index". It has been discussed many times in the past that 'index' is not an appropriate description for what the high-level user does with it, and it has been agreed that 'staging area' is the best term. I haven't followed the previous discussion, but if a final conclusion towards 'staging area' has already been reached, it should probably be revised. Do you mean that how that conclusion was reached should be summarised, or that you don't think it's an appropriate summary of the broader weltanschauung? The term 'staging area' is more intuitive for newcomers which are more familiar with English than with Git, and it seems to be a straightforward mental notion for people with different mother tongues. In fact it is so intuitive that it's used already in a lot online documentation, and the people that do teach Git professionally use this term, because it's easier for many kinds of audiences to grasp. Such online documentation often portraits the 'staging area' as some supposedly 'unique' git feature, which I find _very_ confusing. In fact, every major SCM has a staging area. E.g. you first need to "svn/hg/bzr/p4 add/remove/rename/move" a file, which is somehow recorded in the working copy. These recorded changes will then be picked up by a subsequent "svn/hg/bzr/p4 commit/submit". Additionally, all those systems support selectively committing individual files (by specifying the files on the commit command line or selecting them in a GUI). So git's 'unique staging area' boils down to this: 1.) Recording individual files to commit in advance (instead of specifying them at commit time). Which isn't that hard to grasp. For many, that separation of preparation(s), from the final action, is brand new and difficult to appreciate - it's special to computer systems (where copying is 100% reliable, essentially instantaneous, and in Git's case, 100% verifiable via crypto checksums). 2.) Recording individual diff hunks or even lines to commit. Which is an advanced feature that requires even more advanced commands to be useful (i.e. stash save --keep-index; make; test; commit; stash pop). It is also entirely irrelevant to binary files (i.e. for non-technical users that use git to track documents and stuff). index: an 'index' is a guide of pointers to something else; a book index has a list of entries so the reader can locate information easily without having to go through the whole book. Git porcelain is not using the staging area to find out entries quicker; it's not an index. The 'staging area' is a sorted list of most recently checked out files, and its primary purpose is to quickly detect changes in the working copy (i.e. its an index). There is a big (human) problem here. We (humans) are able to believe contradictory things ("He ain't heavy, he's my brother" to quote a song). The Index (file) isn't a staging area, but we are happy to flip flop between the two ideas depending on context - others can feel confused. In one sense the "Index" is an implementation detail of the concept of a packing area where a shipment (commit) is prepared, which is most commonly called the staging are in populist discussions (which I believe is the summary I mentioned above) Of the 28 builtin main porcelain commands, 18 read the index (read_index), and 11 of those also check the state of the working copy (ie_match_stat). Incidentally, the latter include _all_ commands that update the so-called 'staging area'. Subversion until recently kept the checked out files scattered in **/.svn/text-base directories, and many operations were terribly slow due to this. Subversion 1.7 introduced a new working copy format, based on a database in the root .svn directory (i.e. an index), leading to tremendous performance improvements. The git index is a major feature that facilitates the incredible performance we're so much addicted to...why be shy about it and call it something else? stage: a 'stage' is a special area designated for convenience in order for some activity to take place; an orator would prepare a stage in order for her speak to be successful, otherwise many people might not be able to hear, or see her. Git porcelain is using the staging area precisely as a special area to be separated from the working directory for convenience. I'm not a native speaker, but in my limited understanding, 'staging' in computer jargon is the process of preparing data for a production system (i.e. until the 'stage' or 'state' of the data is ready for production). It has nothing to do with the 'stage' in a theater. I've never heard the term 'staging' beeing used for source code or software (that would be 'integration'). I've also never heard 'staging' for moving data back from a production system to some work- or development area. Even 'native' sp
Re: pack corruption post-mortem
On Sat, 19 Oct 2013, Duy Nguyen wrote: > On Wed, Oct 16, 2013 at 3:34 PM, Jeff King wrote: > > I was recently presented with a repository with a corrupted packfile, > > and was asked if the data was recoverable. This post-mortem describes > > the steps I took to investigate and fix the problem. I thought others > > might find the process interesting, and it might help somebody in the > > same situation. > > > > I started with an fsck, which found a problem with exactly one object > > (I've used $pack and $obj below to keep the output readable, and also > > because I'll refer to them later): > > > > $ git fsck > > error: $pack SHA1 checksum mismatch > > error: index CRC mismatch for object $obj from $pack at offset 51653873 > > error: inflate: data stream error (incorrect data check) > > error: cannot unpack $obj from $pack at offset 51653873 > > I wonder if we should protect the sha-1 and pathname tables in packv4 > with CRC too. A bit flipped in there could cause stream of corrupt > objects and make it hard to pinpoint the corrupt location.. It turns out that we already have this covered. The SHA1 used in the name of the pack file is actually the SHA1 checksum of the SHA1 table. The path and ident tables are already protected by the CRC32 in the zlib deflated stream. Normal objects are also zlib deflated (except for their header) but you need to inflate them in order to have this CRC verified, which the pack data copy tries to avoid. Hence the separate CRC32 in the index file in that case. However the pack v4 tables are very unlikely to be reused as is from one pack to another. Nicolas -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH] send-pack: don't send a thin pack when the server doesn't support it
On Tue, 2013-10-08 at 15:22 -0700, Jonathan Nieder wrote: > Duy Nguyen wrote: > > > Or maybe it's not that late. How about you go with your patch and add > > thin-pack capability to receive-pack too? > > > > When new "git push" is used against old server, thin pack is disabled. > > But that's not a big deal (I hope). > > Could we have separate patches to introduce the server-side capability > and then to request it in the client? That way, people with old > servers can apply the patch introducing the capability if they want. That could work. > > The new meaning of the "thin-pack" capability should also be > documented in Documentation/technical/protocol-capabilities.txt. Oh, right; the capability as described is only about the server being able to generate a thin pack. Wouldn't his mean that git shouldn't assume that *any* remote can fix thin packs, though? (other than most servers you do talk to happen to). Anyway, facts on the ground and all that. I'll prepare some > > Done that way and with enough time between the server advertising the > capability and the client looking for it, it seems like a good idea. If such patches would be accepted, that would be great. By the time this all gets merged, we might have thin pack fixing merged into libgit2, but there will still be uses where fixing them isn't an issue due to other constraints. Cheers, cmn -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: separate-git-dir doesn't work with mapped drive
On 2013-10-19 10.49, Ain Valtin wrote: > Hi > > I want to use git in a VirtualBox guest so that the repository is on > the host drive. So in the VB settings for the guest I set up a shared > folder "gitRepos" to /home/ain with full access rights. Then in the > guest OS (Windows XP) I map this shared folder as G drive. Now in the > project dir I execute > > C:\...\TPP>git init --separate-git-dir g:/TPP > Initialized empty Git repository in g:/TPP/ > > Checked, the repo structure is in the "g:/TPP/" (thus the guest OS can > write to the mapped dir) and in the .git file created to the project > dir there is line > > gitdir: g:/TPP > > However when tring to use the repo it fails to recognise the g:/TPP path, ie > > C:\...\TPP>git add . > fatal: unable to access '../../../../../../g:/TPP/config': Invalid argument > > Also tryed "gitdir: //VBOXSVR/gitRepos/TPP" but this fails too: > > C:\...\TPP>git add . > fatal: Unable to create 'C:/Documents and > Settings/Ain/prog/AVT/TPP/../../../../../..///VBOXSVR/gitRepos/TPP/index.lock': > No such file or directory > > Am I doing something wrong or is it a bug? Any idea how to get it to work? > > BTW the VB is 3.0.14 ie rather old version but it seems that this > isn't the problem as the git init recognises the mapped drive but > other commands fail. > git version is 1.8.4.msysgit.0 > > > TIA > ain (This could go to msysgit mailing list, so I add a CC) (and I think, it is a known issue) As a work around, could you try this: > Downgrading to msysgit 1.8.3 fixes my problem. and tell us if this helps you? /Torsten -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: pack corruption post-mortem
On Sat, Oct 19, 2013 at 7:41 AM, Nicolas Pitre wrote: > On Sat, 19 Oct 2013, Duy Nguyen wrote: > >> On Wed, Oct 16, 2013 at 3:34 PM, Jeff King wrote: >> > I was recently presented with a repository with a corrupted packfile, >> > and was asked if the data was recoverable. This post-mortem describes >> > the steps I took to investigate and fix the problem. I thought others >> > might find the process interesting, and it might help somebody in the >> > same situation. >> > >> > I started with an fsck, which found a problem with exactly one object >> > (I've used $pack and $obj below to keep the output readable, and also >> > because I'll refer to them later): >> > >> > $ git fsck >> > error: $pack SHA1 checksum mismatch >> > error: index CRC mismatch for object $obj from $pack at offset 51653873 >> > error: inflate: data stream error (incorrect data check) >> > error: cannot unpack $obj from $pack at offset 51653873 >> >> I wonder if we should protect the sha-1 and pathname tables in packv4 >> with CRC too. A bit flipped in there could cause stream of corrupt >> objects and make it hard to pinpoint the corrupt location.. > > It turns out that we already have this covered. > > The SHA1 used in the name of the pack file is actually the SHA1 checksum > of the SHA1 table. I continue to believe this naming is wrong. The pack file name should be the SHA1 checksum of the pack data stream, but the SHA1 table. This would allow cleaner update of a repository that was repacked with different compression settings, but identical objects. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v3 10/11] read-cache.c: fix memory leaks caused by removed cache entries
Karsten Blees writes: > When cache_entry structs are removed from index_state.cache, they are not > properly freed. Freeing those entries wasn't possible before because we > couldn't remove them from index_state.name_hash. > > Now that we _do_ remove the entries from name_hash, we can also free them. > Add free(cache_entry) to all call sites of name-hash.c::remove_name_hash in > read-cache.c, as name-hash.c isn't concerned with cache_entry allocation. > > cmd_rm and unmerge_index_entry_at use cache_entry.name after removing the > entry. Copy the name so that we don't access memory that has been freed. Is this the version that is currently in pu? There's a valgrind test failure in current pu that bisects to 36850edb, which would seem to be from this email but doesn't have the right author date. Worse, I cannot apply this on top of 36850edb^ because I don't have the 'index' blobs for this patch. Confusing. In any case 36850edb currently breaks several valgrind tests. You can valgrind only t6022.16 like so (that one test is sufficient to track it down and it's much faster that way): cd t ./t6022-merge-rename.sh --valgrind-only=16 The valgrind error in t6022.16 looks like this: ==4959== Invalid read of size 1 ==4959==at 0x5682A38: vfprintf (vfprintf.c:1629) ==4959==by 0x56AC564: vsnprintf (vsnprintf.c:119) ==4959==by 0x542005: vreportf (usage.c:12) ==4959==by 0x54216C: error_builtin (usage.c:42) ==4959==by 0x54261B: error (usage.c:147) ==4959==by 0x4FC681: read_index_unmerged (read-cache.c:1900) ==4959==by 0x475CF1: reset_index (reset.c:68) ==4959==by 0x476A72: cmd_reset (reset.c:346) ==4959==by 0x405999: run_builtin (git.c:314) ==4959==by 0x405B2C: handle_internal_command (git.c:477) ==4959==by 0x405C46: run_argv (git.c:523) ==4959==by 0x405DE2: main (git.c:606) ==4959== Address 0x5bedb54 is 84 bytes inside a block of size 104 free'd ==4959==at 0x4C2ACDA: free (vg_replace_malloc.c:468) ==4959==by 0x4F9360: remove_index_entry_at (read-cache.c:482) ==4959==by 0x4FA469: add_index_entry_with_check (read-cache.c:964) ==4959==by 0x4FA5A4: add_index_entry (read-cache.c:993) ==4959==by 0x4FC663: read_index_unmerged (read-cache.c:1899) ==4959==by 0x475CF1: reset_index (reset.c:68) ==4959==by 0x476A72: cmd_reset (reset.c:346) ==4959==by 0x405999: run_builtin (git.c:314) ==4959==by 0x405B2C: handle_internal_command (git.c:477) ==4959==by 0x405C46: run_argv (git.c:523) ==4959==by 0x405DE2: main (git.c:606) If you need any more information/help, just ask :-) -- Thomas Rast t...@thomasrast.ch -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: separate-git-dir doesn't work with mapped drive
> > (This could go to msysgit mailing list, so I add a CC) > (and I think, it is a known issue) > > As a work around, could you try this: >> Downgrading to msysgit 1.8.3 fixes my problem. > and tell us if this helps you? Yes, 1.8.3 seems to work (did initial commit successfully). ain -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 1/2] Revert "test-lib: support running tests under valgrind in parallel"
This reverts commit ad0e6233320b004f0d686f6887c803e508607bd2. --valgrind-parallel was broken from the start: during review I made the whole valgrind setup code conditional on not being a --valgrind-parallel worker child. But even the children crucially need $GIT_VALGRIND to be set; it should therefore have been set outside the conditional. The fix would be a two-liner, but since the introduction of the feature, almost four months have passed without anyone noticing that it is broken. So this feature is not worth the about hundred lines of test-lib.sh complexity. Revert it. Signed-off-by: Thomas Rast --- t/test-lib.sh | 106 -- 1 file changed, 22 insertions(+), 84 deletions(-) diff --git a/t/test-lib.sh b/t/test-lib.sh index 0fa7dfd..eaf6759 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -205,15 +205,6 @@ do --valgrind-only=*) valgrind_only=$(expr "z$1" : 'z[^=]*=\(.*\)') shift ;; - --valgrind-parallel=*) - valgrind_parallel=$(expr "z$1" : 'z[^=]*=\(.*\)') - shift ;; - --valgrind-only-stride=*) - valgrind_only_stride=$(expr "z$1" : 'z[^=]*=\(.*\)') - shift ;; - --valgrind-only-offset=*) - valgrind_only_offset=$(expr "z$1" : 'z[^=]*=\(.*\)') - shift ;; --tee) shift ;; # was handled already --root=*) @@ -227,7 +218,7 @@ do esac done -if test -n "$valgrind_only" || test -n "$valgrind_only_stride" +if test -n "$valgrind_only" then test -z "$valgrind" && valgrind=memcheck test -z "$verbose" && verbose_only="$valgrind_only" @@ -377,9 +368,7 @@ maybe_teardown_verbose () { last_verbose=t maybe_setup_verbose () { test -z "$verbose_only" && return - if match_pattern_list $test_count $verbose_only || - { test -n "$valgrind_only_stride" && - expr $test_count "%" $valgrind_only_stride - $valgrind_only_offset = 0 >/dev/null; } + if match_pattern_list $test_count $verbose_only then exec 4>&2 3>&1 # Emit a delimiting blank line when going from @@ -403,7 +392,7 @@ maybe_teardown_valgrind () { maybe_setup_valgrind () { test -z "$GIT_VALGRIND" && return - if test -z "$valgrind_only" && test -z "$valgrind_only_stride" + if test -z "$valgrind_only" then GIT_VALGRIND_ENABLED=t return @@ -412,10 +401,6 @@ maybe_setup_valgrind () { if match_pattern_list $test_count $valgrind_only then GIT_VALGRIND_ENABLED=t - elif test -n "$valgrind_only_stride" && - expr $test_count "%" $valgrind_only_stride - $valgrind_only_offset = 0 >/dev/null - then - GIT_VALGRIND_ENABLED=t fi } @@ -568,9 +553,6 @@ test_done () { esac } - -# Set up a directory that we can put in PATH which redirects all git -# calls to 'valgrind git ...'. if test -n "$valgrind" then make_symlink () { @@ -618,42 +600,33 @@ then make_symlink "$symlink_target" "$GIT_VALGRIND/bin/$base" || exit } - # In the case of --valgrind-parallel, we only need to do the - # wrapping once, in the main script. The worker children all - # have $valgrind_only_stride set, so we can skip based on that. - if test -z "$valgrind_only_stride" - then - # override all git executables in TEST_DIRECTORY/.. - GIT_VALGRIND=$TEST_DIRECTORY/valgrind - mkdir -p "$GIT_VALGRIND"/bin - for file in $GIT_BUILD_DIR/git* $GIT_BUILD_DIR/test-* - do - make_valgrind_symlink $file - done - # special-case the mergetools loadables - make_symlink "$GIT_BUILD_DIR"/mergetools "$GIT_VALGRIND/bin/mergetools" - OLDIFS=$IFS - IFS=: - for path in $PATH + # override all git executables in TEST_DIRECTORY/.. + GIT_VALGRIND=$TEST_DIRECTORY/valgrind + mkdir -p "$GIT_VALGRIND"/bin + for file in $GIT_BUILD_DIR/git* $GIT_BUILD_DIR/test-* + do + make_valgrind_symlink $file + done + # special-case the mergetools loadables + make_symlink "$GIT_BUILD_DIR"/mergetools "$GIT_VALGRIND/bin/mergetools" + OLDIFS=$IFS + IFS=: + for path in $PATH + do + ls "$path"/git-* 2> /dev/null | + while read file do - ls "$path"/git-* 2> /dev/null | - while read file - do - make_valgrind_symlink "$file" - done + make_valgrind_symlink "$file" done - IFS=$OLDIFS - fi + done + IFS=$OLDIFS PATH=$GI
[PATCH 0/2] Revert --valgrind-parallel test option
These patches remove the --valgrind-parallel=N option that was broken from the outset (shame on me). Peff's judgement at the time that its usefulness would approximately be "meh" turns out to be correct. What's not in the commit message, but drives part of my reasoning in doing a revert instead of a fix: I did fix it up locally only to notice that it was too slow in this case for what I actually wanted to use it for. The only valgrind-test workflow that I find bearable is to run all the tests in the background under prove (takes hours), and then use the prove output (which says exactly which subtests fail) in --valgrind-only=. So the latter is -- again Peff was right -- the really useful thing. The only consolation is that I apparently didn't break any other use of the test suite -- otherwise it would presumably have been fixed very quickly. Thomas Rast (2): Revert "test-lib: support running tests under valgrind in parallel" Revert "test-lib: allow prefixing a custom string before "ok N" etc." t/test-lib.sh | 133 +++--- 1 file changed, 34 insertions(+), 99 deletions(-) -- 1.8.4.1.810.g312044e -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH 2/2] Revert "test-lib: allow prefixing a custom string before "ok N" etc."
Now that ad0e623 (test-lib: support running tests under valgrind in parallel, 2013-06-23) has been reverted, this support code has no users any more. Revert it, too. This reverts commit e939e15d241e942662b9f88f6127ab470ab0a0b9. --- t/test-lib.sh | 27 --- 1 file changed, 12 insertions(+), 15 deletions(-) diff --git a/t/test-lib.sh b/t/test-lib.sh index eaf6759..29c1410 100644 --- a/t/test-lib.sh +++ b/t/test-lib.sh @@ -210,9 +210,6 @@ do --root=*) root=$(expr "z$1" : 'z[^=]*=\(.*\)') shift ;; - --statusprefix=*) - statusprefix=$(expr "z$1" : 'z[^=]*=\(.*\)') - shift ;; *) echo "error: unknown test option '$1'" >&2; exit 1 ;; esac @@ -320,12 +317,12 @@ trap 'die' EXIT test_ok_ () { test_success=$(($test_success + 1)) - say_color "" "${statusprefix}ok $test_count - $@" + say_color "" "ok $test_count - $@" } test_failure_ () { test_failure=$(($test_failure + 1)) - say_color error "${statusprefix}not ok $test_count - $1" + say_color error "not ok $test_count - $1" shift echo "$@" | sed -e 's/^/# /' test "$immediate" = "" || { GIT_EXIT_OK=t; exit 1; } @@ -333,12 +330,12 @@ test_failure_ () { test_known_broken_ok_ () { test_fixed=$(($test_fixed+1)) - say_color error "${statusprefix}ok $test_count - $@ # TODO known breakage vanished" + say_color error "ok $test_count - $@ # TODO known breakage vanished" } test_known_broken_failure_ () { test_broken=$(($test_broken+1)) - say_color warn "${statusprefix}not ok $test_count - $@ # TODO known breakage" + say_color warn "not ok $test_count - $@ # TODO known breakage" } test_debug () { @@ -462,8 +459,8 @@ test_skip () { of_prereq=" of $test_prereq" fi - say_color skip >&3 "${statusprefix}skipping test: $@" - say_color skip "${statusprefix}ok $test_count # skip $1 (missing $missing_prereq${of_prereq})" + say_color skip >&3 "skipping test: $@" + say_color skip "ok $test_count # skip $1 (missing $missing_prereq${of_prereq})" : true ;; *) @@ -501,11 +498,11 @@ test_done () { if test "$test_fixed" != 0 then - say_color error "${statusprefix}# $test_fixed known breakage(s) vanished; please update test(s)" + say_color error "# $test_fixed known breakage(s) vanished; please update test(s)" fi if test "$test_broken" != 0 then - say_color warn "${statusprefix}# still have $test_broken known breakage(s)" + say_color warn "# still have $test_broken known breakage(s)" fi if test "$test_broken" != 0 || test "$test_fixed" != 0 then @@ -528,9 +525,9 @@ test_done () { then if test $test_remaining -gt 0 then - say_color pass "${statusprefix}# passed all $msg" + say_color pass "# passed all $msg" fi - say "${statusprefix}1..$test_count$skip_all" + say "1..$test_count$skip_all" fi test -d "$remove_trash" && @@ -544,8 +541,8 @@ test_done () { *) if test $test_external_has_tap -eq 0 then - say_color error "${statusprefix}# failed $test_failure among $msg" - say "${statusprefix}1..$test_count" + say_color error "# failed $test_failure among $msg" + say "1..$test_count" fi exit 1 ;; -- 1.8.4.1.810.g312044e -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: GSoC 2014: Summary so far, discussion starter: how to improve?
Hi, so I was a GSoC:er, I got some (most) of my code merged but didn't fully met my (personal) goals for the project. However I do passed in the eyes of Google. GSoC is _hard_. You end up feeling completely stupid over and over again. Git has hard standards. Beeing just a single programmer and/or just learnt programming in school, there's a lot of difference. I started with learning git (better), read documentation and looking at the codebase and still felt lost. After that I'd to learn communication skills, who to mail, when to mail, how to write a commit message, been real strict with codestyle, setting up a github account, configuring git in a "git contributor friendly way", etc. On Sat, Oct 19, 2013 at 08:09:33AM +0200, Thomas Rast wrote: > Theories > > > These are the hypotheses that I have heard (mostly in [1] and [2]) as > to what is bad about Git's prior GSoC participations. > > * Aiming far too high, focusing on cool/shiny projects with a large > impact. This also affects the students, who tend to cluster around > the largest, shiniest project suggestions. > > * Diminishing returns: Git is too mature, with little low-hanging > fruit left, making such projects harder > > * Projects are too political, progress depending on non-technical > arguments > > * Our mentors suck on various axes, notably not supporting students > enough in things that matter: > - smooth interaction with community > - ensure fast iteration/short cycles > - navigating the code base > > * Scope creep: projects tend to get blocked on some bigger > refactoring/restructuring task that was not in the original proposal > > > > * View GSoC much more as a lot of work than free labor Totally agree, GSoC is an investment for future labor, not labor. > > * Break projects into smaller, easier tasks > - They should individually be simple, quick things if the mentor did > them. > - Should be parallelizable so students don't have to block on reviews. I'd 5-6 smaller projects setup for the summer, I think I managed to do 2-3 of them. (I did however do everything I applied for). I really think it's an excellent idea. This also meant that while one patch waited for review, I'd other things to work on. > > * Mentoring improvements: > - Always have a co-mentor > - Focus on social aspects (who to Cc, etc.) > - Nominate separate "review mentors" to ensure fast review cycles I like the idea of review mentors. However bear in mind that you'll already have three people reviewing the patches (two mentors and Junio). We will not make it look like it's impossible to get things into git.git. > * Have students review some patches This would be excellent. That's a part that I thinks is very usefull and would easy students remaining with git. It's easier to review patches than to make them. As a last part I would say that GSoC learned me a lot. I'm good at git, I know test driven development, I learned shell, I got to play with a huge C-codebase for the first time and I learned open source projects, QA, etc. I would like to thank Jens and Heiko for good mentoring and a lot of patience! (as a sidenote, I did get extremly busy when the school started. I didn't even had time to fix a serious bug in my code (Jens had to clean up after me). However two years later I'd some time again and got a few patches in and I hope to get a few patches into git in the future too). A successful GSoC for git isn't a merged project but continued contribution to git (not necessairly in patches, but also in support and review). A successful GSoC for Google/student is a merged project. A successful GSoC for student is a great learning experience. -- Med vänliga hälsningar Fredrik Gustafsson tel: 0733-608274 e-post: iv...@iveqy.com -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH v8] diff.c: keep arrow(=>) on show_stats()'s shortened filename part to make rename visible
Hello Thomas > Can you briefly describe what you changed in v7 and v8, both compared to > earlier versions and between v7 and v8? On v7, 's basename part is tried to kept. On v7, whole part is tried to kept. For example, in case below: parent_path{sourceDirectory => DestinationDirectory}path1/path2//longlongFilename.txt On v7, this can be like: …{...ceDirectory => …onDirectory}.../longlongFilename.txt On v8, it will be like: …{...irectory => …irectory}path1/path2/longlongFilename.txt This change is based on the review from Junio below. (I myself is not sure what is the better way.) On Oct 17, 2013, at 10:29 PM, Junio C Hamano wrote: > I am not sure if distributing the burden of truncation equally to > three parts so that the resulting pieces are of similar lengths is > really a good idea. Between these two > > {...SourceDirectory => ...nationDirectory}...ileThatWasMoved > {...ceDirectory => ...ionDirectory}nameOfTheFileThatWasMoved > > that attempt to show that the file nameOfTheFileThatWasMoved was > moved from the longSourceDirectory to the DestinationDirectory, the > latter is much more informative, I would think. On Oct 18, 2013, at 1:38 AM, Junio C Hamano wrote: > Yoshioka Tsuneo writes: > >> In the "[PATCH v7]", I changed to keep filename part of suffix to handle >> above case, but not always keep directory part because I feel totally >> keeping all part of long suffix including directory name may cause output >> like: >>…{… => …}…ongPath1/LongPath2/nameOfTheFileThatWasMoved >> And, above may be worse than: >> ...{...ceDirectory => …ionDirectory}.../nameOfTheFileThatWasMoved >> I think. > > I am not sure if I agree. > > Losing LongPath2 part may be more significant data loss than losing > a single bit that says the change is a rename, as the latter may not > quite tell us what these two directories were anyway. Also, I guess Junio might be suspicious to the idea to keep arrow("=>") itself, maybe ? = (From What's cooking in git.git (Oct 2013, #04; Fri, 18)) - diff.c: keep arrow(=>) on show_stats()'s shortened filename part to make rename visible Attempts to give more weight on the fact that a filepair represents a rename than showing substring of the actual path when diffstat lines are not wide enough. I am not sure if that is solving a right problem, though. = Thanks! --- Tsuneo Yoshioka (吉岡 恒夫) yoshiokatsu...@gmail.com On Oct 19, 2013, at 9:24 AM, Thomas Rast wrote: > Yoshioka Tsuneo writes: > >> "git diff -M --stat" can detect rename and show renamed file name like >> "foofoofoo => barbarbar". >> >> Before this commit, this output is shortened always by omitting left most >> part like "...foo => barbarbar". So, if the destination filename is too long, >> source filename putting left or arrow can be totally omitted like >> "...barbarbar", without including any of "foofoofoo =>". >> In such a case where arrow symbol is omitted, there is no way to know >> whether the file is renamed or existed in the original. >> >> Make sure there is always an arrow, like "...foo => ...bar". >> >> The output can contain curly braces('{','}') for grouping. >> So, in general, the output format is "{ => }" >> >> To keep arrow("=>"), try to omit as long as possible at first >> because later part or changing part will be the more important part. >> If it is not enough, shorten , trying to have the same >> maximum length. >> If it is not enough yet, omit . >> >> Signed-off-by: Tsuneo Yoshioka >> Test-added-by: Thomas Rast >> --- > > Can you briefly describe what you changed in v7 and v8, both compared to > earlier versions and between v7 and v8? > > It would be very nice if you could always include such a "patch > changelog" after the "---" above. git-am will ignore the text between > "---" and the diff, so you can write comments for the reviewers there > without creating noise in the commit message. > > Also, please keep reviewers in the Cc list for future discussion/patches > so that they will see them. > > -- > Thomas Rast > t...@thomasrast.ch -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: pack corruption post-mortem
On Sat, Oct 19, 2013 at 9:41 PM, Nicolas Pitre wrote: > On Sat, 19 Oct 2013, Duy Nguyen wrote: > The SHA1 used in the name of the pack file is actually the SHA1 checksum > of the SHA1 table. > > The path and ident tables are already protected by the CRC32 in the zlib > deflated stream. > > Normal objects are also zlib deflated (except for their header) but you > need to inflate them in order to have this CRC verified, which the pack > data copy tries to avoid. Hence the separate CRC32 in the index file in > that case. OK slight change in the subject, what about reading code (i.e. sha1_file.c)? With v2 crc32 is verified by object inflate code. With v4 trees or commits, because we store some (or all) data outside of the deflated stream, we will not benefit from crc32 verifcation previously done for all trees and commits. Should we perform explict crc32 check when reading v4 trees and commits (and maybe verify the sha-1 table too)? -- Duy -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [msysGit] [PATCH] Prevent buffer overflows when path is too big
(may be s/path is too big/path is too long/ ?) On 19.10.13 12:52, Antoine Pelisse wrote: > Currently, most buffers created with PATH_MAX length, are not checked > when being written, and can overflow if PATH_MAX is not big enough to > hold the path. > > Fix that by using strlcpy() where strcpy() was used, and also run some > extra checks when copy is done with memcpy(). > > Reported-by: Wataru Noguchi > Signed-off-by: Antoine Pelisse > --- > abspath.c| 10 +++--- > diffcore-order.c | 2 +- > entry.c | 14 ++ > unpack-trees.c | 2 ++ > 4 files changed, 20 insertions(+), 8 deletions(-) > > diff --git a/abspath.c b/abspath.c > index 64adbe2..0e60ba4 100644 > --- a/abspath.c > +++ b/abspath.c > @@ -216,11 +216,15 @@ const char *absolute_path(const char *path) > const char *prefix_filename(const char *pfx, int pfx_len, const char *arg) > { > static char path[PATH_MAX]; > + > + if (pfx_len > PATH_MAX) I think this should be if (pfx_len > PATH_MAX-1) /* Keep 1 char for '\0' > + die("Too long prefix path: %s", pfx); > + > #ifndef GIT_WINDOWS_NATIVE > if (!pfx_len || is_absolute_path(arg)) > return arg; > memcpy(path, pfx, pfx_len); > - strcpy(path + pfx_len, arg); > + strlcpy(path + pfx_len, arg, PATH_MAX - pfx_len); I'm not sure how to handle overlong path in general, there are several ways: a) Silently overwrite memory (with help of memcpy() and/or strcpy() b) Silently shorten the path using strlcpy() instead of strcpy() c) Avoid the overwriting and call die(). d) Prepare a longer buffer using xmalloc() Today we do a), this is not a good thing and the worst choice. A little side note: It would be good to have test cases for either b), c) or d). As PATH_MAX is OS dependend, we need both a main program written in c and a test case written in t/t.sh. Some existing code can be used for inspiration, e.g. test-wildmatch.c in combination with t/t3070-wildmatch.sh This willl allow us to reproduce the error, and define how git should behave. End of the side note, let's look closer at the suggested patch, implementing b) Silently shortening an overlong path like "/foo/bar/baz" could result something like "/foo/bar/ba" /* That filename may be part of the repo too */ or "/foo/bar/" /* This is a directory, not a file name */ In either case the end user has no idea why git choose another file name. And this could be hard to debug. After a couple of hours she/he may send a message asking for help to the mailing list, and we end up in more people doing debugging. c) Is much easier to debug: Git can not handle this situation, and we print out the parameters in die() I would prefer c) over b), make clear that git can't handle that situation. d) Would mean some more re-factoring: Check all callers to prefix_filename(). Some of them call xstrdup() after prefix_filename(), which mean that we could change prefix_filename() to always return new string which is long enough via xmalloc(), and not a static buffer. So we come to the next point (and this is my personal experience, so please don't get me wrong): how much time can you spend on this? If the answer is kind of "very little", I would go for c) Avoid the silent memory corruption, and say to the user "we can not handle this" If the answer is kind of "little", I would go for c) and a test program, covering all the different code path in abspath() (WHich may deserve a refactoring as well, since the code for GIT_WINDOWS_NATIVE is very similar to the non-GIT_WINDOWS_NATIVE) If the answer is kind of "more than little", a different strategie may be better: Start sending a patch for c) I think we have enough volunteers here for a review, so we can life without the test code. On top of that, some volunteer can develop d). So far I have only looked at abspath(), and your patch touches more places. I think more and more that calling die() with all information included why we call die() is a good starting point. It will allow the users to see what is going on. May be the repo can be re-arranged to use shorter path names than what we can handle. [snip] /Torsten -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [msysGit] [PATCH] Prevent buffer overflows when path is too big
On Sun, Oct 20, 2013 at 07:47:06AM +0200, Torsten Bögershausen wrote: > (may be s/path is too big/path is too long/ ?) > > On 19.10.13 12:52, Antoine Pelisse wrote: > > Currently, most buffers created with PATH_MAX length, are not checked > > when being written, and can overflow if PATH_MAX is not big enough to > > hold the path. > > > > Fix that by using strlcpy() where strcpy() was used, and also run some > > extra checks when copy is done with memcpy(). > > > > Reported-by: Wataru Noguchi > > Signed-off-by: Antoine Pelisse > > --- > > diff --git a/abspath.c b/abspath.c > > index 64adbe2..0e60ba4 100644 > > --- a/abspath.c > > +++ b/abspath.c > > @@ -216,11 +216,15 @@ const char *absolute_path(const char *path) > > const char *prefix_filename(const char *pfx, int pfx_len, const char *arg) > > { > > static char path[PATH_MAX]; Why do you need static there? > > + > > + if (pfx_len > PATH_MAX) > I think this should be > if (pfx_len > PATH_MAX-1) /* Keep 1 char for '\0' > > + die("Too long prefix path: %s", pfx); > > + > > #ifndef GIT_WINDOWS_NATIVE > > if (!pfx_len || is_absolute_path(arg)) > > return arg; > > memcpy(path, pfx, pfx_len); > > - strcpy(path + pfx_len, arg); > > + strlcpy(path + pfx_len, arg, PATH_MAX - pfx_len); > > I'm not sure how to handle overlong path in general, there are several ways: > a) Silently overwrite memory (with help of memcpy() and/or strcpy() > b) Silently shorten the path using strlcpy() instead of strcpy() > c) Avoid the overwriting and call die(). > d) Prepare a longer buffer using xmalloc() > There is also e) modify allocation to place write protected page after buffer end. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [msysGit] [PATCH] Prevent buffer overflows when path is too big
On 20.10.13 08:05, Ondřej Bílka wrote: > On Sun, Oct 20, 2013 at 07:47:06AM +0200, Torsten Bögershausen wrote: >> (may be s/path is too big/path is too long/ ?) >> >> On 19.10.13 12:52, Antoine Pelisse wrote: >>> Currently, most buffers created with PATH_MAX length, are not checked >>> when being written, and can overflow if PATH_MAX is not big enough to >>> hold the path. >>> >>> Fix that by using strlcpy() where strcpy() was used, and also run some >>> extra checks when copy is done with memcpy(). >>> >>> Reported-by: Wataru Noguchi >>> Signed-off-by: Antoine Pelisse >>> --- >>> diff --git a/abspath.c b/abspath.c >>> index 64adbe2..0e60ba4 100644 >>> --- a/abspath.c >>> +++ b/abspath.c >>> @@ -216,11 +216,15 @@ const char *absolute_path(const char *path) >>> const char *prefix_filename(const char *pfx, int pfx_len, const char *arg) >>> { >>> static char path[PATH_MAX]; > > Why do you need static there? Good point. get_pathname() from path.c may be better. >>> + >>> + if (pfx_len > PATH_MAX) >> I think this should be >> if (pfx_len > PATH_MAX-1) /* Keep 1 char for '\0' >>> + die("Too long prefix path: %s", pfx); >>> + >>> #ifndef GIT_WINDOWS_NATIVE >>> if (!pfx_len || is_absolute_path(arg)) >>> return arg; >>> memcpy(path, pfx, pfx_len); >>> - strcpy(path + pfx_len, arg); >>> + strlcpy(path + pfx_len, arg, PATH_MAX - pfx_len); >> >> I'm not sure how to handle overlong path in general, there are several ways: >> a) Silently overwrite memory (with help of memcpy() and/or strcpy() >> b) Silently shorten the path using strlcpy() instead of strcpy() >> c) Avoid the overwriting and call die(). >> d) Prepare a longer buffer using xmalloc() >> > There is also > e) modify allocation to place write protected page after buffer end. Yes, I think this is what electric fence, DUMA or valgrind do: http://sourceforge.jp/projects/freshmeat_efence/ http://duma.sourceforge.net/ http://valgrind.sourceforge.net/ Theses are very good tools for developers, finding memory corruption (or other bugs like using uninitialized memory). One of the motivation I asked for test cases is that a git developer can run these test cases under valgrind and can verify that we are never out of range. For an end user a git "crash" caused by trying to write to a write protected page is better than silently corrupting memory. And a range check, followed by die(), is even easier to debug. For an end user. /Torsten -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html