Re: Something wrong with diff --color-words=regexp?
On 20.02.2015 at 00:52, Mike Hommey wrote:

> Hi,
>
> I was trying to use --color-words with a regex to check a diff, and it
> appears it displays things out of order. Am I misunderstanding what my
> regexp should be doing or is there a bug?
>
> $ git diff -U3 HEAD^ dom/base/nsDOMFileReader.cpp
> diff --git a/dom/base/nsDOMFileReader.cpp b/dom/base/nsDOMFileReader.cpp
> index 6267e0e..fa22590 100644
> --- a/dom/base/nsDOMFileReader.cpp
> +++ b/dom/base/nsDOMFileReader.cpp
> @@ -363,7 +363,7 @@ nsDOMFileReader::DoReadData(nsIAsyncInputStream* aStream, uint64_t aCount)
>        return NS_ERROR_OUT_OF_MEMORY;
>      }
>      if (mDataFormat != FILE_AS_ARRAYBUFFER) {
> -      mFileData = (char *) moz_realloc(mFileData, mDataLen + aCount);
> +      mFileData = (char *) realloc(mFileData, mDataLen + aCount);
>        NS_ENSURE_TRUE(mFileData, NS_ERROR_OUT_OF_MEMORY);
>      }
>
> $ git diff -U3 --color-words='[^ ()]' HEAD^ dom/base/nsDOMFileReader.cpp
> diff --git a/dom/base/nsDOMFileReader.cpp b/dom/base/nsDOMFileReader.cpp
> index 6267e0e..fa22590 100644
> --- a/dom/base/nsDOMFileReader.cpp
> +++ b/dom/base/nsDOMFileReader.cpp
> @@ -363,7 +363,7 @@ nsDOMFileReader::DoReadData(nsIAsyncInputStream* aStream, uint64_t aCount)
>      return NS_ERROR_OUT_OF_MEMORY;
>    }
>    if (mDataFormat != FILE_AS_ARRAYBUFFER) {
>      mFileData = (char *moz_) realloc(mFileData, mDataLen + aCount);
>      NS_ENSURE_TRUE(mFileData, NS_ERROR_OUT_OF_MEMORY);
>    }

Your regexp says that every character (with a few exceptions) by itself
is a word. Your diff says that it deleted the words 'm', 'o', 'z', and
'_'. So, that is not wrong.

Furthermore, your regexp says that space, '(' and ')' are whitespace.
Whitespace is *ignored* for computation of the word difference.
Nevertheless, --color-words mode helpfully keeps the whitespace of the
post-image to produce readable output. In doing so, it has to choose
whether to keep the whitespace before or after a word. It chooses to
keep it before a word. Hence, you see the whitespace sequence ') '
attached in front of 'r' (of 'realloc') instead of after '*'.
So, the procedure is a matter of choice, which sometimes does not match
expectations. Perhaps you meant to say --color-words='[^ ()]+' to split
the diff text into longer words.

-- Hannes

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [RFH] GSoC 2015 application
On Fri, Feb 20, 2015 at 06:35:09AM +0100, Michael Haggerty wrote:

> On 02/18/2015 08:14 PM, Jeff King wrote:
> > The response to my previous email was not overwhelming, but people did
> > express some interest in Git doing GSoC this year. So I've started on
> > the application, using last year's version as a template.
>
> Regretfully, I can't in good conscience volunteer to be a GSoC mentor
> this year. I have too many other projects going on and don't see how I
> can free up enough time to be a good mentor.

Thanks for letting us know. I am somewhat in the same boat. I might be
able to make time, but the bar that the student/project combo would have
to clear would be quite high for me to agree to do so.

This brings up an important issue. We cannot do GSoC without mentors. I
had hoped that people populating the "ideas" list would volunteer to
mentor for their projects. But so far the possibilities are:

  - Stefan

  - me, who has already promised to be stingy

  - Matthieu, who also cited time constraints

  - Junio, who contributed some project ideas, but who in the past has
    declined to mentor in order to remain impartial as the maintainer
    who evaluates student results (which I think is quite reasonable)

So...basically 1 mentor and 2 reticent maybes? That doesn't look good.

We are not committed to anything until we accept student proposals, of
course. But I would not want to waste students' time in applying if it
is not realistic for us to accept them.

-Peff
Re: [RFH] GSoC 2015 application
On Fri, Feb 20, 2015 at 10:26:15AM +0700, Duy Nguyen wrote:

> On Thu, Feb 19, 2015 at 2:14 AM, Jeff King wrote:
> > and the list of microprojects:
> >
> >   http://git.github.io/SoC-2015-Microprojects.html
>
> There is debian bug 777690 [1] that's basically about making tag's
> version sort aware of -rc, -pre suffixes. I imagine it would touch
> versioncmp.c and builtin/tag.c (to retrieve the suffixes from the
> config file).
>
> [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=777690

I think that's a reasonable thing to work on, but it's too big for a
microproject and too small for a GSoC. I think this could be an "extra
credit" for the project to unify for-each-ref, "tag -l", and "branch
-l", though. That would vastly enhance the sorting abilities of the
latter two (e.g., you could sort by taggerdate).

-Peff
Re: Git Scaling: What factors most affect Git performance for a large repo?
On Feb 19, 2015 5:42 PM, David Turner wrote:

> On Fri, 2015-02-20 at 06:38 +0700, Duy Nguyen wrote:
> > > * 'git push'?
> >
> > This one is not affected by how deep your repo's history is, or how
> > wide your tree is, so should be quick..
> >
> > Ah the number of refs may affect both git-push and git-pull. I think
> > Stefan knows better than I in this area.
>
> I can tell you that this is a bit of a problem for us at Twitter. We
> have over 100k refs, which adds ~20MiB of downstream traffic to every
> push.
>
> I added a hack to improve this locally inside Twitter: The client sends
> a bloom filter of shas that it believes that the server knows about;
> the server sends only the sha of master and any refs that are not in
> the bloom filter. The client uses its local version of the servers'
> refs as if they had just been sent. This means that some packs will be
> suboptimal, due to false positives in the bloom filter leading some new
> refs to not be sent. Also, if there were a repack between the pull and
> the push, some refs might have been deleted on the server; we repack
> rarely enough and pull frequently enough that this is hopefully not an
> issue.
>
> We're still testing to see if this works. But due to the number of
> assumptions it makes, it's probably not that great an idea for general
> use.

Good to hear that others are starting to experiment with solutions to
this problem! I hope to hear more updates on this.

I have a prototype of a simpler, and I believe more robust, solution,
but aimed at a smaller use case, I think. On connecting, the client
sends a sha of all its refs/shas, as defined by a refspec, which it also
sends to the server, for which it believes the server might have the
same refs/shas values. The server can then calculate the value of its
refs/shas which meet the same refspec, and then omit sending those refs
if the "verification" sha matches, and instead send only a confirmation
that they matched (along with any refs outside of the refspec). On a
match, the client can inject the local values of the refs which met the
refspec and be guaranteed that they match the server's values.

This optimization is aimed at the worst-case scenario (and is thus the
potentially best-case "compression"): when the client and server match
for all refs (a refs/* refspec). This is something that happens often on
Gerrit server startup, when it verifies that its mirrors are up to date.
One reason I chose this as a starting optimization is that I think it is
one use case which will actually not benefit from "fixing" the git
protocol to only send relevant refs, since all the refs are in fact
relevant here! So something like this will likely be needed in any
future git protocol in order for it to be efficient for this use case.
And I believe this use case is likely to stick around.

With a minor tweak, this optimization should also work when replicating
actual expected updates, by excluding the expected updating refs from
the verification so that the server always sends their values, since
they will likely not match and would wreck the optimization. However,
for this use case it is not clear whether it is actually even worth
caring about the non-updating refs? In theory the knowledge of the
non-updating refs can potentially reduce the amount of data transmitted,
but I suspect that as the ref count increases, this has diminishing
returns and mostly ends up chewing up CPU and memory in a vain attempt
to reduce network traffic.

Please do keep us up to date on your results,

-Martin

Qualcomm Innovation Center, Inc.
The Qualcomm Innovation Center, Inc.
is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project
Re: [RFH] GSoC 2015 application
On 02/18/2015 08:14 PM, Jeff King wrote:
> The response to my previous email was not overwhelming, but people did
> express some interest in Git doing GSoC this year. So I've started on
> the application, using last year's version as a template.

Regretfully, I can't in good conscience volunteer to be a GSoC mentor
this year. I have too many other projects going on and don't see how I
can free up enough time to be a good mentor.

Michael

--
Michael Haggerty
mhag...@alum.mit.edu
Re: Interested in helping open source friends on HP-UX?
On Thu, Feb 19, 2015 at 02:21:11PM +0100, Michael J Gruber wrote:

> > It passes NO_ICONV through to the test suite, sets up a prerequisite,
> > disables some test scripts which are purely about i18n (e.g.,
> > t3900-i18n-commit), and marks some of the scripts with one-off tests
> > using the ICONV prereq.
>
> Hmm. I know we pass other stuff down, but is this really a good idea?
> It relies on the fact that the git that we test was built with the
> options from there. This assumption breaks with GIT_TEST_INSTALLED, if
> not more.
>
> Basically, it may break as soon as we run the tests by other means than
> "make", which is quite customary if you run single tests.
>
> (And we do pass config.mak down, methinks, but NO_ICONV may come from
> the command line.)

It's not quite so bad as you make out. We write the value to the
GIT-BUILD-OPTIONS file during "make", no matter where it comes from, and
load that in test-lib.sh. So:

  make NO_ICONV=Nope
  cd t
  ./t3901-i18n-patch.sh

works just fine (for this and for any of the other options we mark
there).

It won't work for GIT_TEST_INSTALLED, but that is not a new problem.
Fundamentally you cannot expect to test a version built without option X
without telling git _somehow_ that it was built that way. I suspect
GIT_TEST_INSTALLED is not all that widely used, or somebody would have
complained before. But if we really want to support it, I think the
right thing is to bake GIT-BUILD-OPTIONS into the binary, so that "git
--build-options" dumps it. It might also have value for debugging and
forensics in general.

> Jeff, you got it wrong. You should do the hard part and leave the easy
> part to us!

Oops. :)

-Peff
Re: [PATCH] log --decorate: do not leak "commit" color into the next item
On Thu, Feb 19, 2015 at 10:02:12AM -0800, Junio C Hamano wrote:

> Jeff King writes:
>
> > Yeah, I think this is a good fix. I had a vague feeling that we may
> > have done this on purpose to let the decoration color "inherit" from
> > the existing colors for backwards compatibility, but I don't think
> > that could ever have worked (since color.decorate.* never defaulted
> > to "normal").
>
> Hmph, but that $gmane/191118 talks about giving bold to commit-color
> and then expecting the decorations to inherit the boldness, a wish I
> can understand. But I do not necessarily agree with it---it relies on
> the fact that after "(" and ", " there is no reset, which is not how
> everything else works.

I don't see anybody actually _wanting_ the inheritance. It is mentioned
merely as an observation.

So yeah, we would break anybody who does:

  [color "diff"]
          commit = blue
  [color "decorate"]
          branch = normal
          remoteBranch = normal
          tag = normal
          stash = normal
          HEAD = normal

and expects the "blue" to persist automatically. But given that this
behaves in the opposite way of every other part of git's color handling,
I think we can call it a bug, and people doing that are crazy (they
should s/normal/blue/ in the latter config).

> So this change at least needs to come with an explanation to people
> who are used to and took advantage of this color attribute leakage,
> definitely in the log message and preferably in the documentation
> that covers all the color.* settings, I think.

I'd agree it is worth a mention in the log (and possibly release notes),
but I don't think it is worth polluting the documentation forever
(though explaining that we never inherit might be worth doing, and that
is perhaps what you meant).

-Peff
Re: [RFH] GSoC 2015 application
On Thu, Feb 19, 2015 at 2:14 AM, Jeff King wrote:
> and the list of microprojects:
>
>   http://git.github.io/SoC-2015-Microprojects.html

There is debian bug 777690 [1] that's basically about making tag's
version sort aware of -rc, -pre suffixes. I imagine it would touch
versioncmp.c and builtin/tag.c (to retrieve the suffixes from the config
file).

[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=777690

--
Duy
Re: [RFH] GSoC 2015 application
On Thu, Feb 19, 2015 at 11:32:46AM +0100, Matthieu Moy wrote:

> > I do need somebody to volunteer as backup admin. This doesn't need
> > to involve any specific commitment, but is mostly about what to do if
> > I get hit by a bus.
>
> If you promise me to try hard not to be hit by a bus and no one else
> steps in, I can be the backup admin.

Thanks. I need you to register and create a profile at:

  https://www.google-melange.com/gsoc/homepage/google/gsoc2015

and tell me your username (the information from last year does not
carry forward automatically). Then I mark you as backup admin and (I
think) you have to then accept.

> Throwing out a few ideas for discussion, I can write something if
> people agree.
>
> * "git bisect fixed/unfixed", to allow bisecting a fix instead of a
>   regression less painfully. There were already some proposed patches
>   (https://git.wiki.kernel.org/index.php/SmallProjectsIdeas#git_bisect_fix.2Funfixed),
>   so it shouldn't be too hard. Perhaps this item can be included in the
>   "git bisect --first-parent" idea (turning it into "git bisect
>   improvements").

That seems like a reasonable topic. I was about to say "but it's much
more complicated than fix/unfixed..." but it looks like that wiki entry
covers the past discussion (and reading and understanding that would be
a first step for the student). I agree it's probably smaller than a
full-summer project and can get lumped into the other bisect idea.

> * Be nicer to the user on tracked/untracked merge conflicts
>   [...]

Sounds OK to me, though I agree the merging of untracked files is a
little controversial. There are also a lot of corner cases in
merge-recursive, and I think still some documented cases where we can
overwrite untracked files. Maybe a more encompassing project would be to
organize and dig into some of those corner cases.

>  SoC-2015-Microprojects.md | 42 ++
>  1 file changed, 42 insertions(+)

Thanks, applied, although...
> +### Move ~/.git-credentials and ~/.git-credential-cache to ~/.config/git
> +
> +Most of git dotfiles can be located, at the user's option, in
> +~/. or in ~/.config/git/, following the [XDG
> +standard](http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html).
> +~/.git-credentials and ~/.git-credential-cache are still hardcoded as
> +~/., and should allow using the XDG directory layout too
> +(~/.git-credentials could be allowed as ~/.config/git/credential and
> +~/.git-credential-cache could be allowed as ~/.cache/git/credential,
> +possibly modified by $XDG_CONFIG_HOME and $XDG_CACHE_HOME).
> +
> +Each of these files can be a microproject of its own. The suggested
> +approach is:
> +
> +* See how XDG was implemented for other files (run "git log --grep
> +  XDG" in Git's source code) and read the XDG specification.
> +
> +* Implement and test the new behavior, without breaking compatibility
> +  with the old behavior.
> +
> +* Update the documentation

I think these might be getting a little larger than "micro". That's OK
if the student can handle it, but we may want to mark them as such. I'll
leave it for now, though, as we have a bit more breathing room on the
microprojects.

> +### Add configuration options for some commonly used command-line options
> +
> +This includes:
> +
> +* git am -3
> +
> +* git am -c
> +
> +Some people always run the command with these options, and would
> +prefer to be able to activate them by default in ~/.gitconfig.

The direction here seems reasonable, though I think we have
mailinfo.scissors already, so "-c" may not be a good example.

> +### Add more builtin patterns for userdiff
> +
> +"git diff" shows the function name corresponding to each hunk after
> +the @@ ... @@ line. For common languages (C, HTML, Ada, Matlab, ...),
> +the way to find the function name is built into Git's source code as
> +regular expressions (see userdiff.c). A few languages are common
> +enough to deserve a built-in driver, but are not yet recognized. For
> +example, CSS, shell.

I am not sure that understanding the horrible regexes involved in some
userdiff counts as "micro", but OK. :)

-Peff
[PATCH 2/2] index-pack: kill union delta_base to save memory
Once we know the number of objects in the input pack, we allocate an
array of nr_objects of struct delta_entry. On x86-64, this struct is 32
bytes long. The union delta_base, which is part of struct delta_entry,
provides enough space to store either an ofs-delta (8 bytes) or a
ref-delta (20 bytes).

Notice that with "recent" Git versions, ofs-delta objects are preferred
over ref-delta objects, and ref-delta objects have no reason to be
present in a clone pack. So in the clone case we waste
(20-8) * nr_objects bytes because of this union. That's about 38MB out
of 100MB for deltas[] with 3.4M objects, or 38%. deltas[] would be
around 62MB without the waste.

This patch attempts to eliminate that. The deltas[] array is split into
two: one for ofs-deltas and one for ref-deltas. Many functions are also
duplicated because of this split. With this patch, the ofs_deltas[]
array takes 51MB. ref_deltas[] should remain unallocated in the clone
case (0 bytes); this array grows as we see ref-deltas. We save about
half in the clone case, or 25% of total bookkeeping.

The saving is more than the calculation above because some padding in
the old delta_entry struct is removed. ofs_delta_entry is 16 bytes,
including 4 bytes of padding. That's 13MB for padding, but packing the
struct could break platforms that do not support unaligned access. If
someone on 32-bit is really low on memory and only deals with packs
smaller than 2G, using a 32-bit off_t would eliminate the padding and
save 27MB on top.

A note about ofs_deltas allocation. We could use the ref_deltas memory
allocation strategy for ofs_deltas, but that probably just adds more
overhead on top. ofs-deltas are generally the majority (1/2 to 2/3) in
any pack. Incremental realloc may lead to too many memcpys. And if we
preallocate, say, 1/2 or 2/3 of nr_objects initially, the growth rate
of ALLOC_GROW() could make this array larger than nr_objects, wasting
more memory.
Brought-up-by: Matthew Sporleder
Signed-off-by: Nguyễn Thái Ngọc Duy
---
 builtin/index-pack.c | 260 +++
 1 file changed, 160 insertions(+), 100 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 07b2c0c..eae41c4 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -28,11 +28,6 @@ struct object_stat {
 	int base_object_no;
 };

-union delta_base {
-	unsigned char sha1[20];
-	off_t offset;
-};
-
 struct base_data {
 	struct base_data *base;
 	struct base_data *child;
@@ -52,26 +47,28 @@ struct thread_local {
 	int pack_fd;
 };

-/*
- * Even if sizeof(union delta_base) == 24 on 64-bit archs, we really want
- * to memcmp() only the first 20 bytes.
- */
-#define UNION_BASE_SZ 20
-
 #define FLAG_LINK (1u<<20)
 #define FLAG_CHECKED (1u<<21)

-struct delta_entry {
-	union delta_base base;
+struct ofs_delta_entry {
+	off_t offset;
+	int obj_no;
+};
+
+struct ref_delta_entry {
+	unsigned char sha1[20];
 	int obj_no;
 };

 static struct object_entry *objects;
 static struct object_stat *obj_stat;
-static struct delta_entry *deltas;
+static struct ofs_delta_entry *ofs_deltas;
+static struct ref_delta_entry *ref_deltas;
 static struct thread_local nothread_data;
 static int nr_objects;
-static int nr_deltas;
+static int nr_ofs_deltas;
+static int nr_ref_deltas;
+static int ref_deltas_alloc;
 static int nr_resolved_deltas;
 static int nr_threads;

@@ -480,7 +477,8 @@ static void *unpack_entry_data(unsigned long offset, unsigned long size,
 }

 static void *unpack_raw_entry(struct object_entry *obj,
-			      union delta_base *delta_base,
+			      off_t *ofs_offset,
+			      unsigned char *ref_sha1,
 			      unsigned char *sha1)
 {
 	unsigned char *p;
@@ -509,11 +507,10 @@ static void *unpack_raw_entry(struct object_entry *obj,
 	switch (obj->type) {
 	case OBJ_REF_DELTA:
-		hashcpy(delta_base->sha1, fill(20));
+		hashcpy(ref_sha1, fill(20));
 		use(20);
 		break;
 	case OBJ_OFS_DELTA:
-		memset(delta_base, 0, sizeof(*delta_base));
 		p = fill(1);
 		c = *p;
 		use(1);
@@ -527,8 +524,8 @@ static void *unpack_raw_entry(struct object_entry *obj,
 			use(1);
 			base_offset = (base_offset << 7) + (c & 127);
 		}
-		delta_base->offset = obj->idx.offset - base_offset;
-		if (delta_base->offset <= 0 || delta_base->offset >= obj->idx.offset)
+		*ofs_offset = obj->idx.offset - base_offset;
+		if (*ofs_offset <= 0 || *ofs_offset >= obj->idx.offset)
 			bad_object(obj->idx.offset, _("delta base offset is out of bound"));
 		break;
 	case OBJ_COMMIT:
@@ -612,55 +609,108 @@ static void *get_data_from_pack(struct object_entry *obj)
 	return unpack_d
[PATCH 1/2] index-pack: reduce object_entry size to save memory
For each object in the input pack, we need one struct object_entry. On
x86-64, this struct is 64 bytes long. Although:

- The 8 bytes for delta_depth and base_object_no are only useful when
  show_stat is set. And it's never set unless someone is debugging.

- The three fields hdr_size, type and real_type take 4 bytes each even
  though they never use more than 4 bits.

By moving delta_depth and base_object_no out of struct object_entry and
making the other 3 fields one byte long instead of 4, we shrink 25% of
this struct. On a 3.4M object repo (*) that's about 53MB. The saving is
less impressive compared to index-pack memory use for basic bookkeeping
(**), about 16%.

(*) linux-2.6.git already has 4M objects as of v3.19-rc7, so this is
not an unrealistic number of objects that we have to deal with.

(**) 3.4M * (sizeof(object_entry) + sizeof(delta_entry)) = 311MB

Brought-up-by: Matthew Sporleder
Signed-off-by: Nguyễn Thái Ngọc Duy
---
 builtin/index-pack.c | 30 +++---
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/builtin/index-pack.c b/builtin/index-pack.c
index 4632117..07b2c0c 100644
--- a/builtin/index-pack.c
+++ b/builtin/index-pack.c
@@ -18,9 +18,12 @@ static const char index_pack_usage[] =
 struct object_entry {
 	struct pack_idx_entry idx;
 	unsigned long size;
-	unsigned int hdr_size;
-	enum object_type type;
-	enum object_type real_type;
+	unsigned char hdr_size;
+	char type;
+	char real_type;
+};
+
+struct object_stat {
 	unsigned delta_depth;
 	int base_object_no;
 };
@@ -64,6 +67,7 @@ struct delta_entry {
 };

 static struct object_entry *objects;
+static struct object_stat *obj_stat;
 static struct delta_entry *deltas;
 static struct thread_local nothread_data;
 static int nr_objects;
@@ -873,13 +877,15 @@ static void resolve_delta(struct object_entry *delta_obj,
 	void *base_data, *delta_data;

 	if (show_stat) {
-		delta_obj->delta_depth = base->obj->delta_depth + 1;
+		int i = delta_obj - objects;
+		int j = base->obj - objects;
+		obj_stat[i].delta_depth = obj_stat[j].delta_depth + 1;
 		deepest_delta_lock();
-		if (deepest_delta < delta_obj->delta_depth)
-			deepest_delta = delta_obj->delta_depth;
+		if (deepest_delta < obj_stat[i].delta_depth)
+			deepest_delta = obj_stat[i].delta_depth;
 		deepest_delta_unlock();
+		obj_stat[i].base_object_no = j;
 	}
-	delta_obj->base_object_no = base->obj - objects;
 	delta_data = get_data_from_pack(delta_obj);
 	base_data = get_base_data(base);
 	result->obj = delta_obj;
@@ -902,7 +908,7 @@ static void resolve_delta(struct object_entry *delta_obj,
  * "want"; if so, swap in "set" and return true. Otherwise, leave it untouched
  * and return false.
  */
-static int compare_and_swap_type(enum object_type *type,
+static int compare_and_swap_type(char *type,
 				 enum object_type want,
 				 enum object_type set)
 {
@@ -1499,7 +1505,7 @@ static void show_pack_info(int stat_only)
 		struct object_entry *obj = &objects[i];

 		if (is_delta_type(obj->type))
-			chain_histogram[obj->delta_depth - 1]++;
+			chain_histogram[obj_stat[i].delta_depth - 1]++;
 		if (stat_only)
 			continue;
 		printf("%s %-6s %lu %lu %"PRIuMAX,
@@ -1508,8 +1514,8 @@ static void show_pack_info(int stat_only)
 		       (unsigned long)(obj[1].idx.offset - obj->idx.offset),
 		       (uintmax_t)obj->idx.offset);
 		if (is_delta_type(obj->type)) {
-			struct object_entry *bobj = &objects[obj->base_object_no];
-			printf(" %u %s", obj->delta_depth, sha1_to_hex(bobj->idx.sha1));
+			struct object_entry *bobj = &objects[obj_stat[i].base_object_no];
+			printf(" %u %s", obj_stat[i].delta_depth, sha1_to_hex(bobj->idx.sha1));
 		}
 		putchar('\n');
 	}
@@ -1672,6 +1678,8 @@ int cmd_index_pack(int argc, const char **argv, const char *prefix)
 	curr_pack = open_pack_file(pack_name);
 	parse_pack_header();
 	objects = xcalloc(nr_objects + 1, sizeof(struct object_entry));
+	if (show_stat)
+		obj_stat = xcalloc(nr_objects + 1, sizeof(struct object_stat));
 	deltas = xcalloc(nr_objects, sizeof(struct delta_entry));
 	parse_pack_objects(pack_sha1);
 	resolve_deltas();
--
2.3.0.rc1.137.g477eb31
[PATCH 0/2] nd/slim-index-pack-memory-usage updates
Compared to 'pu', the first patch is unchanged, except the commit
message. The second patch has __attribute__((packed)) removed because
it causes problems on some ARM systems. x86 people who want to save
more memory just have to put it back by themselves.

Nguyễn Thái Ngọc Duy (2):
  index-pack: reduce object_entry size to save memory
  index-pack: kill union delta_base to save memory

 builtin/index-pack.c | 290 +++
 1 file changed, 179 insertions(+), 111 deletions(-)

--
2.3.0.rc1.137.g477eb31
Re: Git Scaling: What factors most affect Git performance for a large repo?
On Fri, 2015-02-20 at 06:38 +0700, Duy Nguyen wrote:
> >    * 'git push'?
>
> This one is not affected by how deep your repo's history is, or how
> wide your tree is, so should be quick..
>
> Ah the number of refs may affect both git-push and git-pull. I think
> Stefan knows better than I in this area.

I can tell you that this is a bit of a problem for us at Twitter. We
have over 100k refs, which adds ~20MiB of downstream traffic to every
push.

I added a hack to improve this locally inside Twitter: the client sends
a bloom filter of shas that it believes that the server knows about; the
server sends only the sha of master and any refs that are not in the
bloom filter. The client uses its local version of the servers' refs as
if they had just been sent. This means that some packs will be
suboptimal, due to false positives in the bloom filter leading some new
refs to not be sent. Also, if there were a repack between the pull and
the push, some refs might have been deleted on the server; we repack
rarely enough and pull frequently enough that this is hopefully not an
issue.

We're still testing to see if this works. But due to the number of
assumptions it makes, it's probably not that great an idea for general
use.

There are probably more complex schemes to compute minimal (or
small-enough) packs; in particular, if the patch is just a few megs off
of master, it's better to just send the whole pack. That doesn't work
for us because we've got a log-based replication scheme that the pack
appends to, and we don't want the log to get too big; we want
more-minimal packs than that. But it might work for others.
Re: Git Scaling: What factors most affect Git performance for a large repo?
On Fri, Feb 20, 2015 at 6:29 AM, Ævar Arnfjörð Bjarmason wrote:
> Anecdotally I work on a repo at work (where I'm mostly "the Git guy")
> that's:
>
>  * Around 500k commits
>  * Around 100k tags
>  * Around 5k branches
>  * Around 500 commits/day, almost entirely to the same branch
>  * 1.5 GB .git checkout.
>  * Mostly text source, but some binaries (we're trying to cut down[1]
>    on those)

Would be nice if you could make an anonymized version of this repo
public. Working on a "real" large repo is better than an artificial
one.

> But actually most of "git fetch" is spent in the reachability check
> subsequently done by "git-rev-list" which takes several seconds. I

I wonder if reachability bitmaps could help here..

> haven't looked into it but there's got to be room for optimization
> there, surely it only has to do reachability checks for new refs, or
> could run in some "I trust this remote not to send me corrupt data
> completely" mode (which would make sense within a company where you
> can trust your main Git box).

No, it's not just about trusting the server side, it's about catching
data corruption on the wire as well. We have a trick to avoid the
reachability check in the clone case, which is much more expensive than
a fetch. Maybe we could do something further to help the fetch case
_if_ reachability bitmaps don't help.

--
Duy
Re: Git Scaling: What factors most affect Git performance for a large repo?
On Thu, Feb 19, 2015 at 04:26:58PM -0500, Stephen Morton wrote:

> I posted this to comp.version-control.git.user and didn't get any
> response. I think the question is plumbing-related enough that I can
> ask it here.
>
> I'm evaluating the feasibility of moving my team from SVN to git. We
> have a very large repo. [1] We will have a central repo using GitLab
> (or similar) that everybody works with. Forks, code sharing, pull
> requests etc. will be done through this central server.
>
> By 'performance', I guess I mean speed of day to day operations for
> devs.
>
>    * (Obviously, trivially, a (non-local) clone will be slow with a
>      large repo.)
>    * Will a few simultaneous clones from the central server also slow
>      down other concurrent operations for other users?

This hasn't been a problem for us at $DAYJOB. Git doesn't lock anything
on fetches, so each process is independent. We probably have about
sixty developers (and maybe twenty other occasional users) that manage
to interact with our Git server all day long. We also have probably
twenty smoker (CI) systems pulling at two-hour intervals, or, when
there's nothing to do, every two minutes, plus probably fifteen to
twenty build systems pulling hourly. I assume you will provide adequate
resources for your server.

>    * Will 'git pull' be slow?
>    * 'git push'?

The most pathological case I've seen for git push is a branch with a
single commit merged into the main development branch. As of Git 2.3.0,
the performance regression here is fixed.

Obviously, the speed of your network connection will affect this. Even
at 30 MB/s, cloning several gigabytes of data takes time. Git tries
hard to eliminate sending a lot of data, so if your developers keep
reasonably up-to-date, the cost of establishing the connection will
tend to dominate. I see pull and push times that are less than 2
seconds in most cases.

>    * 'git commit'? (It is listed as slow in reference [3].)
>    * 'git status'? (Slow again in reference 3 though I don't see it.)
These can be slow with slow disks or over remote file systems. I recommend not doing that. I've heard rumbles that disk performance is better on Unix, but I don't use Windows so I can't say. You should keep your .gitignore files up-to-date to avoid enumerating untracked files. There's some work towards making this less of an issue. git blame can be somewhat slow, but it's not something I use more than about once a day, so it doesn't bother me that much. > Assuming I can put lots of resources into a central server with lots of CPU, > RAM, fast SSD, fast networking, what aspects of the repo are most likely to > affect devs' experience? >* Number of commits >* Sheer disk space occupied by the repo The number of files can impact performance due to the number of stat()s required. >* Number of tags. >* Number of branches. The number of tags and branches individually is really less relevant than the total number of refs (tags, branches, remote branches, etc). Very large numbers of refs can impact performance on pushes and pulls due to the need to enumerate them all. >* Binary objects in the repo that cause it to bloat in size [1] >* Other factors? If you want good performance, I'd recommend the latest version of Git both client- and server-side. Newer versions of Git provide pack bitmaps, which can dramatically speed up clones and fetches, and Git 2.3.0 fixes a performance regression with large numbers of refs in non-shallow repositories. It is totally worth it to roll your own packages of git if your vendor provides old versions. > Of the various HW items listed above --CPU speed, number of cores, RAM, SSD, > networking-- which is most critical here? I generally find that having a good disk cache is important with large repositories. It may be advantageous to make sure the developer machines have adequate memory. Performance is notably better on development machines (VMs) with 2 GB or 4 GB of memory instead of 1 GB. 
I can't speak to the server side, as I'm not directly involved with its deployment. > Assume ridiculous numbers. Let me exaggerate: say 1 million commits, 15 GB > repo, > 50k tags, 1,000 branches. (Due to historical code fixups, another 5,000 > "fix-up > branches" which are just one little dangling commit required to change the > code > a little bit between a commit a tag that was not quite made from it.) I routinely work on a repo that's 1.9 GB packed, with 25k (and rapidly growing) refs. Other developers work on a repo that's 9 GB packed, with somewhat fewer refs. We don't tend to have problems with this. Obviously, performance is better on some of our smaller repos, but it's not unacceptable on the larger ones. I generally find that the 940 KB repo with huge numbers of files performs worse than the 1.9 GB repo with somewhat fewer. If you can split your repository into multiple logical repositories, that wil
Something wrong with diff --color-words=regexp?
Hi, I was trying to use --color-words with a regex to check a diff, and it appears it displays things out of order. Am I misunderstanding what my regexp should be doing or is there a bug? $ git diff -U3 HEAD^ dom/base/nsDOMFileReader.cpp diff --git a/dom/base/nsDOMFileReader.cpp b/dom/base/nsDOMFileReader.cpp index 6267e0e..fa22590 100644 --- a/dom/base/nsDOMFileReader.cpp +++ b/dom/base/nsDOMFileReader.cpp @@ -363,7 +363,7 @@ nsDOMFileReader::DoReadData(nsIAsyncInputStream* aStream, uint64_t aCount) return NS_ERROR_OUT_OF_MEMORY; } if (mDataFormat != FILE_AS_ARRAYBUFFER) { - mFileData = (char *) moz_realloc(mFileData, mDataLen + aCount); + mFileData = (char *) realloc(mFileData, mDataLen + aCount); NS_ENSURE_TRUE(mFileData, NS_ERROR_OUT_OF_MEMORY); } $ git diff -U3 --color-words='[^ ()]' HEAD^ dom/base/nsDOMFileReader.cpp diff --git a/dom/base/nsDOMFileReader.cpp b/dom/base/nsDOMFileReader.cpp index 6267e0e..fa22590 100644 --- a/dom/base/nsDOMFileReader.cpp +++ b/dom/base/nsDOMFileReader.cpp @@ -363,7 +363,7 @@ nsDOMFileReader::DoReadData(nsIAsyncInputStream* aStream, uint64_t aCount) return NS_ERROR_OUT_OF_MEMORY; } if (mDataFormat != FILE_AS_ARRAYBUFFER) { mFileData = (char *moz_) realloc(mFileData, mDataLen + aCount); NS_ENSURE_TRUE(mFileData, NS_ERROR_OUT_OF_MEMORY); } (This is with 2.3.0) Cheers, Mike -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
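For what it's worth, a word regex that matches whole identifiers produces the output Mike expected. A sketch in a throwaway repo; the identifier regex is my suggestion, not something from the thread, and --word-diff=plain is used so the word boundaries are visible without color (per the git-diff docs, --color-words=<regex> is equivalent to --word-diff=color --word-diff-regex=<regex>):

```shell
# Reproduce the moz_realloc -> realloc hunk with a regex that treats
# whole C identifiers as single words (suggested regex, not from the thread).
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q
printf 'p = (char *) moz_realloc(p, n);\n' > f.c
git add f.c && git -c user.name=t -c user.email=t@example.com commit -qm old
printf 'p = (char *) realloc(p, n);\n' > f.c
git -c user.name=t -c user.email=t@example.com commit -qam new
git diff --word-diff=plain \
    --word-diff-regex='[A-Za-z_][A-Za-z_0-9]*|[^[:space:]]' HEAD^ -- f.c
# The changed word now shows up in place as [-moz_realloc-]{+realloc+}
```

The first alternative makes an identifier one word; the second makes any other non-space character its own word, so parentheses still diff character-by-character but names stay intact.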
Re: [RFD/PATCH] stash: introduce checkpoint mode
On Feb 19, 2015, at 09:49, Junio C Hamano wrote: "Kyle J. McKay" writes: What about a shortcut to "reset-and-apply" as well? I have often been frustrated when "git stash apply" refuses to work because I have changes that would be stepped on and there's no --force option like git checkout has. I end up doing a reset just so I can run stash apply. Doesn't that cut both ways, though? A single step short-cut, done in any way other than a more explicit way such as "git reset --hard && git stash apply" (e.g. "git stash reset-and-apply" or "git stash apply --force") that makes it crystal clear that the user _is_ discarding, has a risk of encouraging users to form a dangerous habit of invoking the short-cut without thinking and leading to "oops, I didn't mean that!". Does that reasoning not also apply to the plethora of commands that take "--force" already? I didn't check them all, but tag, checkout, push and branch immediately come to mind. Why is it okay for all those other commands to have a --force mode, but not git stash?
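The explicit sequence under discussion can be sketched in a throwaway repo (file names and setup are illustrative; the point is that reset --hard deliberately discards local changes before the apply):

```shell
# "git stash apply" refuses when local changes would be stepped on,
# so reset first, then apply. WARNING: reset --hard discards work;
# that is exactly the danger the thread is debating.
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q
g() { git -c user.name=t -c user.email=t@example.com "$@"; }
echo base > file && git add file && g commit -qm base
echo stashed > file && git stash push -q   # stash one set of changes
echo dirty > file                          # unrelated local changes
git stash apply || true   # refused: local changes would be overwritten
git reset --hard -q       # throw the local changes away...
git stash apply -q        # ...and now the apply succeeds
cat file                  # "stashed"
```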
Re: Git Scaling: What factors most affect Git performance for a large repo?
On Fri, Feb 20, 2015 at 4:26 AM, Stephen Morton wrote: > By 'performance', I guess I mean speed of day to day operations for devs. > >* (Obviously, trivially, a (non-local) clone will be slow with a large > repo.) >* Will a few simultaneous clones from the central server also slow down > other concurrent operations for other users? There are no locks on the server when cloning, so in theory cloning does not affect other operations. Cloning can use lots of memory though (and a lot of CPU unless you turn on the reachability bitmap feature, which you should). >* Will 'git pull' be slow? If we exclude the server side, the size of your tree is the main factor, but your 25k files should be fine (linux has 48k files). >* 'git push'? This one is not affected by how deep your repo's history is, or how wide your tree is, so should be quick. Ah, the number of refs may affect both git-push and git-pull. I think Stefan knows better than I in this area. >* 'git commit'? (It is listed as slow in reference [3].) >* 'git stautus'? (Slow again in reference 3 though I don't see it.) (also git-add) Again, the size of your tree. I'm trying to address problems in [3], but at your repo's size, I don't think you need to worry about it. >* Some operations might not seem to be day-to-day but if they are called > frequently by the web front-end to GitLab/Stash/GitHub etc then > they can become bottlenecks. (e.g. 'git branch --contains' seems terribly > adversely affected by large numbers of branches.) >* Others? git-blame could be slow when a file is modified a lot. -- Duy
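Duy's aside about the reachability bitmap feature can be acted on server-side. A sketch in a throwaway repo (on a real server you would run the config and repack in the bare repo; the repack.writeBitmaps key needs a reasonably recent Git):

```shell
# Enable bitmaps so the object-counting phase of clones/fetches is
# nearly free on subsequent requests.
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q
echo hello > f && git add f
git -c user.name=t -c user.email=t@example.com commit -qm seed
git config repack.writeBitmaps true
git repack -A -d -b -q          # -b / --write-bitmap-index; needs a single pack
ls .git/objects/pack/*.bitmap   # the bitmap index sits next to the pack
```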
Re: Git Scaling: What factors most affect Git performance for a large repo?
On Thu, Feb 19, 2015 at 10:26 PM, Stephen Morton wrote: > I posted this to comp.version-control.git.user and didn't get any response. I > think the question is plumbing-related enough that I can ask it here. > > I'm evaluating the feasibility of moving my team from SVN to git. We have a > very > large repo. [1] We will have a central repo using GitLab (or similar) that > everybody works with. Forks, code sharing, pull requests etc. will be done > through this central server. > > By 'performance', I guess I mean speed of day to day operations for devs. > >* (Obviously, trivially, a (non-local) clone will be slow with a large > repo.) >* Will a few simultaneous clones from the central server also slow down > other concurrent operations for other users? >* Will 'git pull' be slow? >* 'git push'? >* 'git commit'? (It is listed as slow in reference [3].) >* 'git stautus'? (Slow again in reference 3 though I don't see it.) >* Some operations might not seem to be day-to-day but if they are called > frequently by the web front-end to GitLab/Stash/GitHub etc then > they can become bottlenecks. (e.g. 'git branch --contains' seems terribly > adversely affected by large numbers of branches.) >* Others? > > > Assuming I can put lots of resources into a central server with lots of CPU, > RAM, fast SSD, fast networking, what aspects of the repo are most likely to > affect devs' experience? >* Number of commits >* Sheer disk space occupied by the repo >* Number of tags. >* Number of branches. >* Binary objects in the repo that cause it to bloat in size [1] >* Other factors? > > Of the various HW items listed above --CPU speed, number of cores, RAM, SSD, > networking-- which is most critical here? > > (Stash recommends 1.5 x repo_size x number of concurrent clones of > available RAM. > I assume that is good advice in general.) > > Assume ridiculous numbers. Let me exaggerate: say 1 million commits, 15 GB > repo, > 50k tags, 1,000 branches. 
(Due to historical code fixups, another 5,000 > "fix-up > branches" which are just one little dangling commit required to change the > code > a little bit between a commit a tag that was not quite made from it.) > > While there's lots of information online, much of it is old [3] and with git > constantly evolving I don't know how valid it still is. Then there's anecdotal > evidence that is of questionable value.[2] > Are many/all of the issues Facebook identified [3] resolved? (Yes, I > understand Facebook went with Mercurial. But I imagine the git team > nevertheless > took their analysis to heart.) Anecdotally I work on a repo at work (where I'm mostly "the Git guy") that's: * Around 500k commits * Around 100k tags * Around 5k branches * Around 500 commits/day, almost entirely to the same branch * 1.5 GB .git checkout. * Mostly text source, but some binaries (we're trying to cut down[1] on those) The main scaling issues we have with Git are: * "git pull" takes around 10 seconds or so * Operations like "git status" are much slower because they scale with the size of the work tree * Similarly "git rebase" takes a much longer time for each applied commit, I think because it does the equivalent of "git status" for every applied commit. Each commit applied takes around 1-2 seconds. * We have a lot of contention on pushes because we're mostly pushing to one branch. * History spelunking (e.g. git log --reverse -p -G) is taking longer by the day The obvious reason for why "git pull" is slow is because git-upload-pack spews the complete set of refs at you each time. The output from that command is around 10MB in size for us now. It takes around 300 ms to run that locally from hot cache, a bit more to send it over the network. But actually most of "git fetch" is spent in the reachability check subsequently done by "git-rev-list" which takes several seconds. 
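The size of the ref advertisement described above is easy to measure on your own repository, since upload-pack sends every ref (plus peeled tags) at the start of each fetch. A quick local approximation with stock commands, run inside the repo:

```shell
# ls-remote against "." shows roughly what git-upload-pack advertises
# to every fetch client.
git ls-remote . | wc -c    # size of the ref advertisement in bytes
git for-each-ref | wc -l   # total number of refs: heads, tags, remotes, etc.
```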
I haven't looked into it but there's got to be room for optimization there; surely it only has to do reachability checks for new refs, or could run completely in some "I trust this remote not to send me corrupt data" mode (which would make sense within a company where you can trust your main Git box). The "git status" operations could be made faster by having something like watchman; there's been some effort on getting that done in Git, but I haven't tried it. This seems to have been the main focus of Facebook's Mercurial optimization effort. Some of this you can "solve" mostly by doing e.g. "git status -uno"; having support for such unsafe operations (e.g. teaching rebase and pals to use it) would be nice at the cost of some safety, but having something that feeds off inotify would be even better. It takes around 3 minutes to reclone our repo, we really don't care (we rarely re-clone). But I thought I'd mention it because for some reason this is important to Facebook and along with inotify were the two major things they focused on. As f
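The "git status -uno" workaround mentioned above is easy to demonstrate: it skips the untracked-file walk, which is the part that scales with the size of the work tree, at the cost of not reporting untracked files at all:

```shell
# Throwaway demo: -uno suppresses the untracked-file enumeration.
set -e
repo=$(mktemp -d) && cd "$repo" && git init -q
git -c user.name=t -c user.email=t@example.com commit -q --allow-empty -m seed
touch untracked.txt
git status --porcelain        # shows: ?? untracked.txt
git status --porcelain -uno   # shows nothing: untracked files skipped
```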
Re: Git Scaling: What factors most affect Git performance for a large repo?
On Thu, Feb 19, 2015 at 3:06 PM, Stephen Morton wrote: > > I think I addressed most of this in my original post with the paragraph > > "Assume ridiculous numbers. Let me exaggerate: say 1 million commits, > 15 GB repo, > 50k tags, 1,000 branches. (Due to historical code fixups, another > 5,000 "fix-up > branches" which are just one little dangling commit required to > change the code > a little bit between a commit and a tag that was not quite made from it.)" > > To that I'd add 25k files, > no major rewrites, > no huge binary files, but lots of a few MB binary files with many revisions. > > But even without details of my specific concerns, I thought that > perhaps the git developers know what limits git's performance even if > large projects like the kernel are not hitting these limits. > > Steve I did not realize you gave numbers below, as I started answering after reading the first paragraphs. Sorry about that. I think lots of files organized in a hierarchical fashion ranging in the small MB range is not a huge deal. Also, history is a non-issue. The problem arises with having lots of branches. "640 git branches ought to be enough for everybody -- Linus" (just kidding) Git doesn't really scale efficiently with lots of branches (second-hand information, except for fetch/pull where I did some patches on another topic recently). Thanks, Stefan
Re: Git Scaling: What factors most affect Git performance for a large repo?
On Thu, Feb 19, 2015 at 5:21 PM, Stefan Beller wrote: > On Thu, Feb 19, 2015 at 1:26 PM, Stephen Morton > wrote: >> I posted this to comp.version-control.git.user and didn't get any response. I >> think the question is plumbing-related enough that I can ask it here. >> >> I'm evaluating the feasibility of moving my team from SVN to git. We have a >> very >> large repo. [1] >> >> [1] (Yes, I'm investigating ways to make our repo not so large etc. That's >> beyond the scope of the discussion I'd like to have with this >> question. Thanks.) > > What do you mean by large? > * lots of files > * large files > * or even large binary files (bad to diff/merge) > * long history (i.e. lots of small changes) > * impactful history (changes which rewrite nearly everything from scratch) > > For reference, the linux > * has 48414 files, in 3128 directories > * the largest file is 1.1M, the whole repo is 600M > * no really large binary files > * more than 500051 changes/commits including merges > * started in 2004 (when git was invented essentially) > * the .git folder is 1.4G compared to the 600M files, >indicating it may have been rewritting 3 times (well this >metric is bogus, there is lots of compression >going on in .git) > > and linux seems to be doing ok with git. > > So as long as you cannot pinpoint your question on what you are exactly > concerned about, there will be no helpful answer I guess. > > linux is by no means a really large project, there are other projects way > larger than that (I am thinking about the KDE project for example) > and they do fine as well. > > Thanks, > Stefan Hi Stefan, I think I addressed most of this in my original post with the paragraph "Assume ridiculous numbers. Let me exaggerate: say 1 million commits, 15 GB repo, 50k tags, 1,000 branches. 
(Due to historical code fixups, another 5,000 "fix-up branches" which are just one little dangling commit required to change the code a little bit between a commit and a tag that was not quite made from it.)" To that I'd add 25k files, no major rewrites, no huge binary files, but lots of a few MB binary files with many revisions. But even without details of my specific concerns, I thought that perhaps the git developers know what limits git's performance even if large projects like the kernel are not hitting these limits. Steve
Re: Git Scaling: What factors most affect Git performance for a large repo?
On Thu, Feb 19, 2015 at 1:26 PM, Stephen Morton wrote: > I posted this to comp.version-control.git.user and didn't get any response. I > think the question is plumbing-related enough that I can ask it here. > > I'm evaluating the feasibility of moving my team from SVN to git. We have a > very > large repo. [1] > > [1] (Yes, I'm investigating ways to make our repo not so large etc. That's > beyond the scope of the discussion I'd like to have with this > question. Thanks.) What do you mean by large? * lots of files * large files * or even large binary files (bad to diff/merge) * long history (i.e. lots of small changes) * impactful history (changes which rewrite nearly everything from scratch) For reference, the Linux kernel repo * has 48414 files, in 3128 directories * the largest file is 1.1M, the whole repo is 600M * no really large binary files * more than 500051 changes/commits including merges * started in 2004 (when git was invented essentially) * the .git folder is 1.4G compared to the 600M files, indicating it may have been rewritten 3 times (well this metric is bogus, there is lots of compression going on in .git) and the Linux kernel seems to be doing ok with git. So as long as you cannot pinpoint exactly what you are concerned about, there will be no helpful answer, I guess. The Linux kernel is by no means a really large project; there are other projects way larger than that (I am thinking about the KDE project for example) and they do fine as well. Thanks, Stefan
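For comparison with Stefan's kernel numbers, the same statistics can be gathered for any repository with stock commands (run at the top of a work tree):

```shell
git ls-files | wc -l         # tracked files
git rev-list --count --all   # commits reachable from all refs
git count-objects -vH        # loose/packed object counts and sizes
du -sh .git                  # total on-disk size of the history
```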
What's cooking in git.git (Feb 2015, #05; Thu, 19)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The second and third batch of topics have been merged to 'master'. I am tempted to start discarding topics in the Stalled category that haven't seen many reviews and discussions for a long time. You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [Graduated to "master"] * av/wincred-with-at-in-username-fix (2015-01-25) 1 commit (merged to 'next' on 2015-02-16 at 69dd76d) + wincred: fix get credential if username has "@" The credential helper for Windows (in contrib/) used to mishandle a user name with an at-sign in it. * ch/new-gpg-drops-rfc-1991 (2015-01-29) 2 commits (merged to 'next' on 2015-02-16 at e2daf10) + t/lib-gpg: sanity-check that we can actually sign + t/lib-gpg: include separate public keys in keyring.gpg Older GnuPG implementations may not correctly import the keyring material we prepare for the tests to use. * jc/push-cert (2015-02-12) 1 commit (merged to 'next' on 2015-02-16 at f40b3c5) + transport-helper: fix typo in error message when --signed is not supported "git push --signed" gave an incorrectly worded error message when the other side did not support the capability. * jc/remote-set-url-doc (2015-01-29) 1 commit (merged to 'next' on 2015-02-16 at 1f9c342) + Documentation/git-remote.txt: stress that set-url is not for triangular Clarify in the documentation that "remote.<name>.pushURL" and "remote.<name>.URL" are there to name the same repository accessed via different transports, not two separate repositories.
* jk/config-no-ungetc-eof (2015-02-05) 2 commits (merged to 'next' on 2015-02-16 at b7fc890) + config_buf_ungetc: warn when pushing back a random character + config: do not ungetc EOF Reading configuration from a blob object, when it ends with a lone CR, used to confuse the configuration parser. * jk/decimal-width-for-uintmax (2015-02-05) 1 commit (merged to 'next' on 2015-02-16 at e608239) + decimal_width: avoid integer overflow We didn't format an integer that wouldn't fit in "int" but in "uintmax_t" correctly. * jk/pack-bitmap (2015-02-04) 1 commit (merged to 'next' on 2015-02-16 at 2e30424) + ewah: fix building with gcc < 3.4.0 The pack bitmap support did not build with older versions of GCC. * ye/http-accept-language (2015-01-28) 1 commit (merged to 'next' on 2015-02-16 at 10ed819) + http: add Accept-Language header if possible Using environment variable LANGUAGE and friends on the client side, HTTP-based transports now send Accept-Language when making requests. -- [New Topics] * ak/git-pm-typofix (2015-02-18) 1 commit - Git.pm: two minor typo fixes Will merge to 'next'. * jc/decorate-leaky-separator-color (2015-02-18) 1 commit - log --decorate: do not leak "commit" color into the next item "git log --decorate" did not reset colors correctly around the branch names. Will merge to 'next'. -- [Stalled] * nd/list-files (2015-02-09) 21 commits . t3080: tests for git-list-files . list-files: -M aka diff-cached . list-files -F: show submodules with the new indicator '&' . list-files: add -F/--classify . list-files: show directories as well as files . list-files: do not show duplicate cached entries . list-files: sort output and remove duplicates . list-files: add -t back . list-files: add -1 short for --no-column . list-files: add -R/--recursive short for --max-depth=-1 . list-files: -u does not imply showing stages . list-files: make alias 'ls' default to 'list-files' . list-files: a user friendly version of ls-files and more . ls-files: support --max-depth . 
ls-files: add --column . ls-files: add --color to highlight file names . ls-files: buffer full item in strbuf before printing . ls_colors.c: highlight submodules like directories . ls_colors.c: add a function to color a file name . ls_colors.c: parse color.ls.* from config file . ls_colors.c: add $LS_COLORS parsing code A new "git list-files" Porcelain command, "ls-files" with bells and whistles. No comments? No reviews? No interests? * nd/untracked-cache (2015-02-09) 24 commits - git-status.txt: advertisement for untracked cache - untracked cache: guard and disable on system changes - mingw32: add uname() - t7063: tests for untracked cache - update-index: test the system before enabling untracked cache - update-index: manually enable or disable untracked cache - status: enable untracked cache - untracked-cache: temporarily disable with $GIT_DISABLE_UNTRACKED_CACHE - untracked cache: mark index dirty if untracked cache is updated - untracked cache: print stats with $GIT_TRACE_UNTRACKED_STATS - untracked ca
Git Scaling: What factors most affect Git performance for a large repo?
I posted this to comp.version-control.git.user and didn't get any response. I think the question is plumbing-related enough that I can ask it here. I'm evaluating the feasibility of moving my team from SVN to git. We have a very large repo. [1] We will have a central repo using GitLab (or similar) that everybody works with. Forks, code sharing, pull requests etc. will be done through this central server. By 'performance', I guess I mean speed of day-to-day operations for devs. * (Obviously, trivially, a (non-local) clone will be slow with a large repo.) * Will a few simultaneous clones from the central server also slow down other concurrent operations for other users? * Will 'git pull' be slow? * 'git push'? * 'git commit'? (It is listed as slow in reference [3].) * 'git status'? (Slow again in reference 3 though I don't see it.) * Some operations might not seem to be day-to-day but if they are called frequently by the web front-end to GitLab/Stash/GitHub etc then they can become bottlenecks. (e.g. 'git branch --contains' seems terribly adversely affected by large numbers of branches.) * Others? Assuming I can put lots of resources into a central server with lots of CPU, RAM, fast SSD, fast networking, what aspects of the repo are most likely to affect devs' experience? * Number of commits * Sheer disk space occupied by the repo * Number of tags. * Number of branches. * Binary objects in the repo that cause it to bloat in size [1] * Other factors? Of the various HW items listed above --CPU speed, number of cores, RAM, SSD, networking-- which is most critical here? (Stash recommends 1.5 x repo_size x number of concurrent clones of available RAM. I assume that is good advice in general.) Assume ridiculous numbers. Let me exaggerate: say 1 million commits, 15 GB repo, 50k tags, 1,000 branches. 
(Due to historical code fixups, another 5,000 "fix-up branches" which are just one little dangling commit required to change the code a little bit between a commit and a tag that was not quite made from it.) While there's lots of information online, much of it is old [3] and with git constantly evolving I don't know how valid it still is. Then there's anecdotal evidence that is of questionable value.[2] Are many/all of the issues Facebook identified [3] resolved? (Yes, I understand Facebook went with Mercurial. But I imagine the git team nevertheless took their analysis to heart.) Thanks, Steve [1] (Yes, I'm investigating ways to make our repo not so large etc. That's beyond the scope of the discussion I'd like to have with this question. Thanks.) [2] The large amounts of anecdotal evidence relate to the "why don't you try it yourself?" response to my question. I will if I have to, but setting up a properly methodical study is time-consuming and difficult --I don't want to produce poor anecdotal numbers that don't really hold up-- and if somebody's already done the work, then I should leverage it. [3] http://thread.gmane.org/gmane.comp.version-control.git/189776
Re: [PATCH v3] remote-curl: fall back to Basic auth if Negotiate fails
On Wed, Feb 18, 2015 at 04:17:46PM +, Dan Langille (dalangil) wrote: > I just built from ‘master’, on FreeBSD 9.3: > > cd ~/src > git clone https://github.com/git/git.git > cd git > gmake > > Then tried ~/src/git/git clone https://OUR_REPO > > It cores too, and I see: git-remote-https.core Can you compile with debugging symbols and provide a backtrace? I'm not seeing any such behavior on my end, and I'm not sure whether it's my patch or something else that might be present in master. -- brian m. carlson / brian with sandals: Houston, Texas, US +1 832 623 2791 | http://www.crustytoothpaste.net/~bmc | My opinion only OpenPGP: RSA v4 4096b: 88AC E9B2 9196 305B A994 7552 F1BA 225C 0223 B187
Re: odb_mkstemp's 0444 permission broke write/delete access on AFP
On Tue, Feb 17, 2015 at 09:51:38AM +0100, Matthieu Moy wrote: > This should be fixable from Git itself, by replacing the calls to > "unlink" with something like > > int unlink_or_chmod(...) { > if (unlink(...)) { > chmod(...); // give user write permission > return unlink(...); > } > } > > This does not add extra cost in the normal case, and would fix this > particular issue for afp shares. So, I think that would fix the biggest > problem for afp-share users without disturbing others. It seems > reasonable to me to do that unconditionally. This can have security issues if you're trying to unlink a symlink, as chmod will dereference the symlink but unlink will not. Giving the file owner write permission may not be sufficient, as the user may be a member of a group with write access to the repo. A malicious user who also has access to the repo could force the current user to chmod an arbitrary file such that it had looser permissions. I've seen a case where Perl's ExtUtils::MakeMaker chmoded /etc/mime.types 0666 as a result of this. I don't think there's a secure way to implement this unless you're on an OS with lchmod or fchmodat that supports AT_SYMLINK_NOFOLLOW. Linux is not one of those systems.
Re: [PATCH 1/3] connect.c: Improve parsing of literal IPV6 addresses
On Thu, Feb 19, 2015 at 09:54:52AM -0800, Junio C Hamano wrote: I can see that you do not agree with the "If we accept it" part (where "it" refers to "allowing [...] was a bug.")---past acceptance was not a bug for you. Brian is for that "If we accept it", and sees it as a bug. So let's see what he comes up with as a follow-up to the "we should explicitly document it" part. Here's what I propose: -- >8 -- Subject: [PATCH] Documentation: note deprecated syntax for IPv6 SSH URLs We have historically accepted some invalid syntax for SSH URLs containing IPv6 literals. Older versions of Git accepted URLs missing the brackets required by RFC 2732. Note that this behavior is deprecated and that other protocol handlers will not accept this syntax. Signed-off-by: brian m. carlson --- Documentation/urls.txt | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/Documentation/urls.txt b/Documentation/urls.txt index 9ccb246..2c1a84f 100644 --- a/Documentation/urls.txt +++ b/Documentation/urls.txt @@ -38,6 +38,10 @@ The ssh and git protocols additionally support ~username expansion: - git://host.xz{startsb}:port{endsb}/~{startsb}user{endsb}/path/to/repo.git/ - {startsb}user@{endsb}host.xz:/~{startsb}user{endsb}/path/to/repo.git/ +For backwards compatibility reasons, Git, when using ssh URLs, accepts +some URLs containing IPv6 literals that are missing the brackets. This +syntax is deprecated, and other protocol handlers do not permit this. + For local repositories, also supported by Git natively, the following syntaxes may be used: -- 2.2.1.209.g41e5f3a
Re: Interested in helping open source friends on HP-UX?
On Thu, 19 Feb 2015 14:21:11 +0100, Michael J Gruber wrote: > Jeff, you got it wrong. You should do the hard part and leave the easy > part to us! > > Thanks anyways, I'll add this to my HP_UX branch. I did not mention this in earlier mails. When using the HP C-ANSI-C compiler, SIZE_MAX is not set. I had to add

--8<---
#ifndef SIZE_MAX
# define SIZE_MAX (18446744073709551615UL)
/* define SIZE_MAX (4294967295U) */
# endif
-->8---

to these files: sha1_file.c utf8.c walker.c wrapper.c And yes, that could be dynamic and probably be in another header file -- H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/ using perl5.00307 .. 5.21 porting perl5 on HP-UX, AIX, and openSUSE http://mirrors.develooper.com/hpux/ http://www.test-smoke.org/ http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/
[PATCH v3] submodule: Improve documentation of update subcommand
The documentation of 'git submodule update' has several problems: 1) It mentions that value 'none' of submodule.$name.update can be overridden by --checkout, but other combinations of configuration values and command line options are not mentioned. 2) The documentation of submodule.$name.update is scattered across three places, which is confusing. 3) The documentation of submodule.$name.update in gitmodules.txt is incorrect, because the code always uses the value from .git/config and never from .gitmodules. 4) Documentation of --force was incomplete, because it is only effective in case of checkout method of update. This patch fixes all these problems. Now, submodule.$name.update is fully documented in git-submodule.txt and the other files just refer to it. This is based on discussion between Junio C Hamano, Jens Lehmann and myself. Signed-off-by: Michal Sojka --- Documentation/config.txt | 15 +++ Documentation/git-submodule.txt | 58 + Documentation/gitmodules.txt | 18 + 3 files changed, 57 insertions(+), 34 deletions(-) diff --git a/Documentation/config.txt b/Documentation/config.txt index ae6791d..fb2ae37 100644 --- a/Documentation/config.txt +++ b/Documentation/config.txt @@ -2411,12 +2411,17 @@ status.submodulesummary:: submodule.<name>.path:: submodule.<name>.url:: + The path within this project and URL for a submodule. These + variables are initially populated by 'git submodule init'; + edit them to override the URL and other values found in the + `.gitmodules` file. See linkgit:git-submodule[1] and + linkgit:gitmodules[5] for details. + submodule.<name>.update:: - The path within this project, URL, and the updating strategy - for a submodule. These variables are initially populated - by 'git submodule init'; edit them to override the - URL and other values found in the `.gitmodules` file. See - linkgit:git-submodule[1] and linkgit:gitmodules[5] for details. + The default updating strategy for a submodule. 
This variable + is populated by `git submodule init` from the + linkgit:gitmodules[5] file. See description of 'update' + command in linkgit:git-submodule[1]. submodule.<name>.branch:: The remote branch name for a submodule, used by `git submodule diff --git a/Documentation/git-submodule.txt b/Documentation/git-submodule.txt index 8e6af65..72c6fb2 100644 --- a/Documentation/git-submodule.txt +++ b/Documentation/git-submodule.txt @@ -154,14 +154,36 @@ If `--force` is specified, the submodule's work tree will be removed even if it contains local modifications. update:: - Update the registered submodules, i.e. clone missing submodules and - checkout the commit specified in the index of the containing repository. - This will make the submodules HEAD be detached unless `--rebase` or - `--merge` is specified or the key `submodule.$name.update` is set to - `rebase`, `merge` or `none`. `none` can be overridden by specifying - `--checkout`. Setting the key `submodule.$name.update` to `!command` - will cause `command` to be run. `command` can be any arbitrary shell - command that takes a single argument, namely the sha1 to update to. + Update the registered submodules to match what the superproject + expects by cloning missing submodules and updating the working + tree of the submodules. The "updating" can be done in several + ways depending on command line options and the value of + `submodule.<name>.update` in .git/config: + + checkout;; the new commit recorded in the superproject will be + checked out in the submodule on a detached HEAD. This is + done when the `--checkout` option is given, or no option is + given and `submodule.<name>.update` is unset or set + to 'checkout'. + + rebase;; the current branch of the submodule will be rebased + onto the commit recorded in the superproject. This is done + when the `--rebase` option is given, or no option is given and + `submodule.<name>.update` is set to 'rebase'. 
+ + merge;; the commit recorded in the superproject will be merged + into the current branch in the submodule. This is done + when the `--merge` option is given, or no option is given and + `submodule.<name>.update` is set to 'merge'. + + custom command;; arbitrary shell command that takes a single + argument (the sha1 of the commit recorded in the + superproject) is executed. This is done when no option is + given, and `submodule.<name>.update` has the form of + '!command'. ++ +When no option is given and `submodule.<name>.update` is set to 'none', +the submodule is not updated. + If the submodule is not yet initialized, and you just want to use the setting as stored in .gitmodules, you can automatically initializ
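The strategies described in the patch can be tried out in a throwaway repository. A rough sketch (the repository and submodule names are made up; `protocol.file.allow=always` is only needed on newer Git versions and is harmlessly ignored by older ones):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"

# A throwaway submodule and superproject (names are illustrative).
git init -q sub
git -C sub config user.name t
git -C sub config user.email t@example.com
git -C sub commit -q --allow-empty -m init

git init -q super && cd super
git config user.name t
git config user.email t@example.com
git -c protocol.file.allow=always submodule --quiet add "$tmp/sub" lib
git commit -q -m "add submodule"

# The update strategy lives in .git/config, not in .gitmodules:
git config submodule.lib.update rebase
git config submodule.lib.update   # prints: rebase
```

With this in place, a plain `git submodule update` rebases the submodule's current branch instead of detaching its HEAD, exactly as the patch text describes.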
Re: Git Feature Request - show current branch
Michael J Gruber writes: > Randall S. Becker venit, vidit, dixit 19.02.2015 14:32: >> git symbolic-ref --short HEAD > > That errors out when HEAD is detached. Isn't that what you would want to happen anyway? if current=$(that command) then you know $current is checked out else you know HEAD is detached fi If you used another command that gives either the name of the current branch or 4-letter H-E-A-D without any other indication, you cannot tell if you checked out the "HEAD" branch aka refs/heads/HEAD or you are not on any branch. The former would happen after doing this: $ git update-ref refs/heads/HEAD HEAD $ git checkout HEAD Of course, this is not a recommended practice, and "git branch" these days refuses to create refs/heads/HEAD to discourage you from doing so by mistake, but there is no guarantee that the repository whatever script you are writing to work in was created and used by sane people ;-) so you would want to be defensive, no? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
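The defensive pattern Junio describes can be sketched as follows (illustrative; `-q` merely suppresses the error message that `symbolic-ref` prints on a detached HEAD):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name t
git config user.email t@example.com
git commit -q --allow-empty -m one

# On a branch, symbolic-ref prints the branch name and exits 0.
git symbolic-ref --short HEAD

# After detaching, it fails -- and that failure is the information.
git checkout -q --detach
if current=$(git symbolic-ref --short -q HEAD)
then
	echo "on branch $current"
else
	echo "detached HEAD"
fi
# prints: detached HEAD
```

The exit status, not the output, is what lets a script distinguish "on a branch" from "detached", which is exactly why erroring out on detached HEAD is a feature here.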
Re: [PATCH] log --decorate: do not leak "commit" color into the next item
Jeff King writes: > Yeah, I think this is a good fix. I had a vague feeling that we may have > done this on purpose to let the decoration color "inherit" from the > existing colors for backwards compatibility, but I don't think that > could ever have worked (since color.decorate.* never defaulted to > "normal"). Hmph, but that $gmane/191118 talks about giving bold to commit-color and then expecting the decorations to inherit the boldness, a wish I can understand. But I do not necessarily agree with it---it relies on there being no reset after "(" and ", ", which is not how everything else works. So this change at least needs to come with an explanation for people who are used to and took advantage of this color attribute leakage, definitely in the log message and preferably in the documentation that covers all the color.* settings, I think. Thanks.
Re: [PATCH 1/3] connect.c: Improve parsing of literal IPV6 addresses
Torsten Bögershausen writes: > On 02/18/2015 07:40 PM, Junio C Hamano wrote: >> "brian m. carlson" writes: >> >>> I understand that this used to work, but it probably shouldn't have >>> ever been accepted. It's nonstandard, and if we accept it for ssh, >>> people will want it to work for https, and due to libcurl, it simply >>> won't. >>> >>> I prefer to see our past acceptance of this format as a bug. This is >>> the first that I've heard of anyone noticing this (since 2013), so it >>> can't be in common usage. >>> >>> If we accept it, we should explicitly document it as being deprecated >>> and note that it's inconsistent with the way everything else works. >> I was reviewing my Undecided pile today, and I think your objection >> makes sense. >> >> Either of you care to update documentation, please, before I drop >> this series and forget about it? > The URL RFC is much stricter regarding which characters are allowed > in which part of the URL, at least as I read it. > ... > I'm somewhat unsure what to write in the documentation, I must admit. I can see that you do not agree with the "If we accept it" part (where "it" refers to "allowing [...] was a bug.")---past acceptance was not a bug for you. Brian is for that "If we accept it", and sees it as a bug. So let's see what he comes up with as a follow-up to the "we should explicitly document it" part.
Re: [PATCH v2] submodule: Fix documentation of update subcommand
On Thu, Feb 19 2015, Junio C Hamano wrote: > Michal Sojka writes: > >> The documentation of 'git submodule update' has several problems: >> >> 1) It says that submodule.$name.update can be overridden by --checkout >>only if its value is `none`. > > Hmm, I do not read the existing sentence that way, though. The > "only if" above is only in your head and not in the documentation, > no? Yes, you're right. > The way I understand it is that the explanation does not even bother > to say that it is overridable when update is set to something that > clearly corresponds to --option (e.g. 'update=rebase' is for people > too lazy to type --rebase from the command line), but because it is > unclear when it is set to 'update=none', it specifically singles out > that case. I updated the commit message a bit. >> diff --git a/Documentation/config.txt b/Documentation/config.txt >> index ae6791d..f30cbbc 100644 >> --- a/Documentation/config.txt >> +++ b/Documentation/config.txt >> @@ -2411,12 +2411,29 @@ status.submodulesummary:: >> >> submodule.<name>.path:: >> submodule.<name>.url:: >> +The path within this project and URL for a submodule. These >> +variables are initially populated by 'git submodule init'; >> +edit them to override the URL and other values found in the >> +`.gitmodules` file. See linkgit:git-submodule[1] and >> +linkgit:gitmodules[5] for details. >> + > > OK. > >> submodule.<name>.update:: >> -The path within this project, URL, and the updating strategy >> -for a submodule. These variables are initially populated >> -by 'git submodule init'; edit them to override the >> -URL and other values found in the `.gitmodules` file. See >> -linkgit:git-submodule[1] and linkgit:gitmodules[5] for details. >> +The default updating strategy for a submodule, used by `git >> +submodule update`. This variable is populated by `git >> +submodule init` from linkgit:gitmodules[5]. 
>> + >> +If the value is 'checkout' (the default), the new commit >> +specified in the superproject will be checked out in the > > Have you formatted this? I _think_ this change would break the > typesetting by having an empty line there. Right. I need to add a '+' and deindent. >> +submodule on a detached HEAD. >> +If 'rebase', the current branch of the submodule will be >> +rebased onto the commit specified in the superproject. >> +If 'merge', the commit specified in the superproject will be >> +merged into the current branch in the submodule. If 'none', >> +the submodule with name `$name` will not be updated by >> +default. >> +If the value is of form '!command', it will cause `command` to >> +be run. `command` can be any arbitrary shell command that >> +takes a single argument, namely the sha1 to update to. > > I have a feeling that it is better to leave the explanations of > these values in git-submodule.txt (i.e. where you took the above > text from) and say "see description of 'update' command in > linkgit:git-submodule[1]" here to avoid duplication. OK >> submodule.<name>.branch:: >> The remote branch name for a submodule, used by `git submodule >> diff --git a/Documentation/git-submodule.txt >> b/Documentation/git-submodule.txt >> index 8e6af65..c92908e 100644 >> --- a/Documentation/git-submodule.txt >> +++ b/Documentation/git-submodule.txt >> @@ -154,14 +154,13 @@ If `--force` is specified, the submodule's work tree >> will be removed even if >> it contains local modifications. >> >> update:: >> -Update the registered submodules, i.e. clone missing submodules and >> -checkout the commit specified in the index of the containing repository. >> -This will make the submodules HEAD be detached unless `--rebase` or >> -`--merge` is specified or the key `submodule.$name.update` is set to >> -`rebase`, `merge` or `none`. `none` can be overridden by specifying >> -`--checkout`. Setting the key `submodule.$name.update` to `!command` >> -will cause `command` to be run. 
`command` can be any arbitrary shell >> -command that takes a single argument, namely the sha1 to update to. >> +Update the registered submodules to match what the superproject >> +expects by cloning missing submodules and updating the working >> +tree of the submodules > > This part is better than the original. Indeed. You wrote this in a previous email :) >> The "updating" can take various forms >> +and can be configured in .git/config by the >> +`submodule.$name.update` key or by explicitly giving one of >> +'--checkout' (the default), '--merge' or '--rebase' options. See >> +linkgit:git-config[1] for details. > > Because submodule.<name>.update is interesting only to those who run > "git submodule update", and also the command line options that > interact with the setting are only described here not in config.txt, > I think it is better to have the description of various modes here. > > And the description, if it i
Re: [RFD/PATCH] stash: introduce checkpoint mode
"Kyle J. McKay" writes: > What about a shortcut to "reset-and-apply" as well? > > I have often been frustrated when "git stash apply" refuses to work > because I have changes that would be stepped on and there's no --force > option like git checkout has. I end up doing a reset just so I can > run stash apply. Doesn't that cut both ways, though? A single step short-cut, done in any way other than a more explicit way such as "git reset --hard && git stash apply" (e.g. "git stash reset-and-apply" or "git stash apply --force") that makes it crystal clear that the user _is_ discarding, has a risk of encouraging users to form a dangerous habit of invoking the short-cut without thinking and leading to "oops, I didn't mean that!".
Re: [PATCH 1/3] connect.c: Improve parsing of literal IPV6 addresses
On 02/18/2015 07:40 PM, Junio C Hamano wrote: > "brian m. carlson" writes: > >> On Thu, Jan 22, 2015 at 11:05:29PM +0100, Torsten Bögershausen wrote: >>> We want to support ssh://bmc@2001:470:1f05:79::1/git/bmc/homedir.git/ >>> because e.g. the Git shipped with Debian (1.7.10.4) (and a lot of >>> other installations) supports it. >> I understand that this used to work, but it probably shouldn't have >> ever been accepted. It's nonstandard, and if we accept it for ssh, >> people will want it to work for https, and due to libcurl, it simply >> won't. >> >> I prefer to see our past acceptance of this format as a bug. This is >> the first that I've heard of anyone noticing this (since 2013), so it >> can't be in common usage. >> >> If we accept it, we should explicitly document it as being deprecated >> and note that it's inconsistent with the way everything else works. > I was reviewing my Undecided pile today, and I think your objection > makes sense. > > Either of you care to update documentation, please, before I drop > this series and forget about it? The URL RFC is much stricter regarding which characters are allowed in which part of the URL, at least as I read it. The "problem" started when /usr/bin/ssh accepted things like /usr/bin/ssh fe80:x:y:z%eth0 and Git simply passed the hostname to ssh. And when the [] was there, it was stripped because ssh doesn't like them. URLs like ssh://bmc@2001:470:1f05:79::1/git/bmc/homedir.git/ simply worked, and nobody ever complained about this (until now); Git never rejected IPV6 URLs without [], please correct me if I'm wrong. Git never cared about the exact URL, so IPV6 URLs without [] were allowed from "day one". On top of that, we support the short form, user@host:~ or other variants. But we never claimed to be compatible with RFC 1738, even if it makes sense to do so. What exactly should we write in the documentation? 
Git supports RFC 1738 (but is not as strict in parsing the URL, because we assume that the host name resolver will do some checking for us). Git currently does not support user@[fe80::x:y:z], even if the RFC suggests it. Git never claimed to be 100% compatible with RFC 1738, and will probably never be (as we have old code that is as it is). We (at least I) don't want to break existing repos, rejecting URLs that had been working before and stopped working because the Git version was updated or so. This patch series is attempting to be backwards compatible with what old, older, and oldest versions of Git accepted, at the price of accepting URLs which do not conform to the RFC. It fixes the long standing issue that user@[fe80:] did not work. I'm somewhat unsure what to write in the documentation, I must admit. Unfortunately URL parsing is a tricky thing; this patch tries to make improvements. Especially, it adds test cases, which are good to prevent further breakage. Updating the documentation was never part of the patch series, and if the documentation is updated, this is done in a separate commit anyway. How much does this series qualify for "we didn't update the docs, but fixed the code, let's drop it"?
Re: Git Feature Request - show current branch
Randall S. Becker venit, vidit, dixit 19.02.2015 14:32: > git symbolic-ref --short HEAD That errors out when HEAD is detached. git rev-parse --symbolic-full-name [--abbrev-ref] HEAD returns the branch name or HEAD. Though it's a bit difficult to discover. I guess git 3.0 will have "git branch" and "git branches" :) Michael
Strange reachability inconsistency (apparently, at least...)
I have a (fsck-clean) git tree in which for 2 commits A and B: * "git merge-base --is-ancestor A B" returns 0 * "git log B..A" returns a non-empty set of commits I get this behaviour with 2.3.0 as well as with 2.1.3 and 1.7.12. Is that a real bug or am I just misinterpreting something ?
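For comparison, here is what the two checks look like when they agree. A throwaway sanity script (this only shows the expected consistent behaviour; it does not reproduce the reported repository):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name t
git config user.email t@example.com

git commit -q --allow-empty -m one
A=$(git rev-parse HEAD)
git commit -q --allow-empty -m two
B=$(git rev-parse HEAD)

# Both checks should give the same answer about reachability:
git merge-base --is-ancestor "$A" "$B" && echo "A is an ancestor of B"
git log --oneline "$B..$A"   # empty: no commits reachable from A but not B
```

If `merge-base --is-ancestor A B` succeeds, every commit reachable from A is reachable from B, so `log B..A` must be empty; both results at once would indeed indicate either a bug or corrupted reachability data.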
git blame swallows up lines in case of mixed line endings
Hi Folks, I encounter unexpected behavior in the following case: file content: line1 line2 line3 line4 This is what I get as console output (on Windows): > git blame -s file.txt 7db36436 1) line1 line3436 2) line2 7db36436 3) line4 This is the real content: > git blame -s file.txt > blame.txt blame.txt opened in Notepad++: 7db36436 1) line1 7db36436 2) line2 line3 7db36436 3) line4 Admittedly, very stupid editors, such as Windows Notepad, cannot handle mixed line endings either. But is this also the way git blame should behave? Kind regards Konstantin
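What seems to be happening is that blame reproduces the line contents verbatim, including a stray CR from a CRLF-terminated line; on a terminal the CR moves the cursor back to column 0 and subsequent output overwrites the line. A sketch that makes the CR visible (the file contents are made up to mirror the report):

```shell
set -e
tmp=$(mktemp -d) && cd "$tmp"
git init -q
git config user.name t
git config user.email t@example.com

# line2 ends in CRLF; the other lines end in LF.
printf 'line1\nline2\r\nline3\nline4\n' >file.txt
git add file.txt
git commit -q -m init

# The CR is part of the blamed line content and reaches the terminal;
# piping through od -c shows it as \r instead of letting it mangle the
# display.
git blame -s file.txt | od -c | grep '\\r'
```

So blame itself emits well-formed output; the "swallowed" line is a terminal rendering artifact of the embedded carriage return.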
Re: [RFD/PATCH] stash: introduce checkpoint mode
On Feb 19, 2015, at 04:34, Michael J Gruber wrote: "git stash save" performs the steps "create-store-reset". Often, users try to use "stash save" as a way to to save their current state (index, worktree) before an operation like "checkout/reset --patch" they don't feel confident about, and are forced to do "git stash save && git stash apply". Provide an extra mode that does "create-store" only without the reset, so that one can "ceckpoint" the sate and keep working on it. s/sate/state/ Suggested-by: "Kyle J. McKay" Signed-off-by: Michael J Gruber --- Notes: I'm not sure about how to best expose this mode: git stash checkpoint git stash save --checkpoint Maybe it is best to document the former and rename "--checkpoint" to "--no-reset"? Once the user figures out that "save" is really "save-and-reset" I think "--no-reset" makes more sense. It certainly seems more discoverable via an explicit "checkpoint" command though, but that's really just an alias so maybe it's better left up to the user to make one. There would need to be some updated docs (git-stash.txt) to go with the change... Also, a "safe return" to a checkpoint probably requires git reset --hard && git stash pop although "git stash pop" will do in many cases. Should we provide a shortcut "restore" which does the reset-and-pop? What about a shortcut to "reset-and-apply" as well? I have often been frustrated when "git stash apply" refuses to work because I have changes that would be stepped on and there's no --force option like git checkout has. I end up doing a reset just so I can run stash apply. What about if git stash apply/pop grokked a --force option? That would seem to eliminate the need for a "reset-and-pop"/"reset-and-apply" shortcut while also being useful to non-checkpoint stashes as well. 
git-stash.sh | 13 + 1 file changed, 13 insertions(+) diff --git a/git-stash.sh b/git-stash.sh index d4cf818..42f140c 100755 --- a/git-stash.sh +++ b/git-stash.sh @@ -193,12 +193,16 @@ store_stash () { } save_stash () { + checkpoint= keep_index= patch_mode= untracked= while test $# != 0 do case "$1" in + -c|--checkpoint) + checkpoint=t + ;; -k|--keep-index) keep_index=t ;; @@ -267,6 +271,11 @@ save_stash () { die "$(gettext "Cannot save the current status")" say Saved working directory and index state "$stash_msg" + if test -n "$checkpoint" + then + exit 0 + fi + if test -z "$patch_mode" then git reset --hard ${GIT_QUIET:+-q} @@ -576,6 +585,10 @@ save) shift save_stash "$@" ;; +checkpoint) + shift + save_stash "--checkpoint" "$@" + ;; apply) shift apply_stash "$@" -- Otherwise this looks good. A very small change to add the functionality. -Kyle
RE: Git Feature Request - show current branch
Hi Martin, I use: git symbolic-ref --short HEAD in scripts. Not sure it's the best way, but it works 100% for me. Regards, Randall -Original Message- From: git-ow...@vger.kernel.org [mailto:git-ow...@vger.kernel.org] On Behalf Of mdc...@seznam.cz Sent: February 19, 2015 8:15 AM To: git@vger.kernel.org Subject: Git Feature Request - show current branch Hello, To start with, I did not find an official way to submit feature requests, so hopefully this is the right way to do so - if not, then my apologies; I'd appreciate it if somebody could re-submit to the proper place. I'd like to request adding a parameter to 'git branch' that would only show the current branch (w/o the star) - i.e. the outcome should only be the name of the branch that is normally marked with the star when I do the 'git branch' command. This may be very helpful in some external scripts that simply need to know the name of the current branch. I know there are multiple ways to do this today (some described here: http://stackoverflow.com/questions/6245570/how-to-get-current-branch-name-in-git) but I really think that adding a simple argument to 'git branch' would be very useful instead of forcing people to use 'workarounds'. My suggestion is to name the parameter '--current' or '--show-current'. Example: Command: git branch Outcome: branchA branchB * master Command: git branch --current Outcome: master Thank you, Martin
Re: Interested in helping open source friends on HP-UX?
Jeff King venit, vidit, dixit 19.02.2015 13:54: > On Thu, Feb 19, 2015 at 12:20:02PM +0100, Michael J Gruber wrote: > >> OK, so we should use NO_ICONV on HP_UX then. >> Failing so many tests with NO_ICONV is certainly not ideal, but I'm not sure we should care to protect so many tests with a prerequisite. >>> >>> How feasible is it to isolate those tests into separate test files that >>> people that know to not use e.g. Asian can safely ignore them? >> >> We have the prerequisite mechanism for that, and most probably, the >> tests are "isolated" already, in the sense that with NO_ICONV, only >> trivial setup tests succeed for those test files but all "proper" tests >> fail. But I'll check. Need a good test to set the prerequisite, though. > > I took a first pass at this. The results are below (and I am hoping one > of you can use it as a base to build on, as I do not want to commit to > doing the second half, as you will see :) ). > > It passes NO_ICONV through to the test suite, sets up a prerequisite, > disables some test scripts which are purely about i18n (e.g., > t3900-i18n-commit), and marks some of the scripts with one-off tests > using the ICONV prereq. Hmm. I know we pass other stuff down, but is this really a good idea? It relies on the fact that the git that we test was built with the options from there. This assumption breaks with GIT_TEST_INSTALLED, if not more. Basically, it may break as soon as we run the tests by other means than "make", which is quite customary if you run single tests. (And we do pass config.mak down, me thinks, but NO_ICONV may come from the command line.) > Note that it also has some code changes around reencode_string_len. > These aren't strictly necessary, but they silence gcc warnings when > compiled with NO_ICONV. In that case we do: > > #define reencode_string_len(a,b,c,d,e) NULL > > but "e" is an out-parameter. We don't promise it is valid if the > function returns NULL (which it does here). 
I'm kind of surprised the > compiler doesn't realize that: > > foo = reencode_string_len(...); > if (foo) > bar(); > > is dead code, since the first line becomes "foo = NULL". So that's > optional. > > So, on to the tricky parts. Here are the failures that remain: > > 1. The script builds up a commit history through the script, and later > tests depend on this for things like commit timestamps or the exact > shape of history. t9350 is an example of this (it has one failing > test which can be marked, but then other tests later fail in > confusing ways). > > 2. The script creates commits with encoded commit messages, then uses > those both for cases that care about the encoding, and those that > do not. t4041 is an example here. I think it would be best to use > vanilla commit messages for the main body of tests, and then > explicitly test the encoding-related features separately. I think > t4205 and t6006 are in this boat, too. > > I also tested this on a system with a working "iconv". If we are > building with NO_ICONV, I am tempted to say that there should be no need > to run the "iconv" command-line program at all. But t6006, for example, > does it a lot outside of any test_expect_*. Probably it should be: > > test_lazy_prereq ICONV ' > test -z "$NO_ICONV" && > utf8_o=$(printf "\303\263") && > latin1_o=$(printf "\363") && > test "$(echo $utf8_o | iconv -f UTF-8 -t ISO-8859-1)" = "$latin1_o" > ' > > or something, and all of that setup should be wrapped in a > "test_expect_success ICONV ...". Of course that is the easy part. The > hard part is splitting the ICONV setup from the vanilla commit setup so > that the other tests can run. Jeff, you got it wrong. You should do the hard part and leave the easy part to us! Thanks anyways, I'll add this to my HP_UX branch. 
> --- > diff --git a/Makefile b/Makefile > index e8ce649..c460ce8 100644 > --- a/Makefile > +++ b/Makefile > @@ -2112,6 +2112,7 @@ endif > ifdef GIT_TEST_CMP_USE_COPIED_CONTEXT > @echo GIT_TEST_CMP_USE_COPIED_CONTEXT=YesPlease >>$@ > endif > + @echo NO_ICONV=\''$(subst ','\'',$(subst ','\'',$(NO_ICONV)))'\' >>$@ > @echo NO_GETTEXT=\''$(subst ','\'',$(subst ','\'',$(NO_GETTEXT)))'\' > >>$@ > @echo GETTEXT_POISON=\''$(subst ','\'',$(subst > ','\'',$(GETTEXT_POISON)))'\' >>$@ > ifdef GIT_PERF_REPEAT_COUNT > diff --git a/pretty.c b/pretty.c > index 9d34d02..74fe5fb 100644 > --- a/pretty.c > +++ b/pretty.c > @@ -1497,7 +1497,7 @@ void format_commit_message(const struct commit *commit, > } > > if (output_enc) { > - int outsz; > + int outsz = 0; > char *out = reencode_string_len(sb->buf, sb->len, > output_enc, utf8, &outsz); > if (out) > diff --git a/strbuf.c b/strbuf.c > index 88cafd4..6d8ad4b 100644 > --- a/strbuf.c > +++ b/strbuf.c > @@ -94,7 +94,7 @@ void strbuf_ltr
Git Feature Request - show current branch
Hello, To start with, I did not find an official way to submit feature requests, so hopefully this is the right way to do so - if not, then my apologies; I'd appreciate it if somebody could re-submit to the proper place. I'd like to request adding a parameter to 'git branch' that would only show the current branch (w/o the star) - i.e. the outcome should only be the name of the branch that is normally marked with the star when I do the 'git branch' command. This may be very helpful in some external scripts that simply need to know the name of the current branch. I know there are multiple ways to do this today (some described here: http://stackoverflow.com/questions/6245570/how-to-get-current-branch-name-in-git) but I really think that adding a simple argument to 'git branch' would be very useful instead of forcing people to use 'workarounds'. My suggestion is to name the parameter '--current' or '--show-current'. Example: Command: git branch Outcome: branchA branchB * master Command: git branch --current Outcome: master Thank you, Martin
Re: Interested in helping open source friends on HP-UX?
On Thu, Feb 19, 2015 at 12:20:02PM +0100, Michael J Gruber wrote: > OK, so we should use NO_ICONV on HP_UX then. > > >> Failing so many tests with NO_ICONV is certainly not ideal, but I'm not > >> sure we should care to protect so many tests with a prerequisite. > > > > How feasible is it to isolate those tests into separate test files that > > people that know to not use e.g. Asian can safely ignore them? > > We have the prerequisite mechanism for that, and most probably, the > tests are "isolated" already, in the sense that with NO_ICONV, only > trivial setup tests succeed for those test files but all "proper" tests > fail. But I'll check. Need a good test to set the prerequisite, though. I took a first pass at this. The results are below (and I am hoping one of you can use it as a base to build on, as I do not want to commit to doing the second half, as you will see :) ). It passes NO_ICONV through to the test suite, sets up a prerequisite, disables some test scripts which are purely about i18n (e.g., t3900-i18n-commit), and marks some of the scripts with one-off tests using the ICONV prereq. Note that it also has some code changes around reencode_string_len. These aren't strictly necessary, but they silence gcc warnings when compiled with NO_ICONV. In that case we do: #define reencode_string_len(a,b,c,d,e) NULL but "e" is an out-parameter. We don't promise it is valid if the function returns NULL (which it does here). I'm kind of surprised the compiler doesn't realize that: foo = reencode_string_len(...); if (foo) bar(); is dead code, since the first line becomes "foo = NULL". So that's optional. So, on to the tricky parts. Here are the failures that remain: 1. The script builds up a commit history through the script, and later tests depend on this for things like commit timestamps or the exact shape of history. t9350 is an example of this (it has one failing test which can be marked, but then other tests later fail in confusing ways). 2. 
The script creates commits with encoded commit messages, then uses those both for cases that care about the encoding, and those that do not. t4041 is an example here. I think it would be best to use vanilla commit messages for the main body of tests, and then explicitly test the encoding-related features separately. I think t4205 and t6006 are in this boat, too. I also tested this on a system with a working "iconv". If we are building with NO_ICONV, I am tempted to say that there should be no need to run the "iconv" command-line program at all. But t6006, for example, does it a lot outside of any test_expect_*. Probably it should be: test_lazy_prereq ICONV ' test -z "$NO_ICONV" && utf8_o=$(printf "\303\263") && latin1_o=$(printf "\363") && test "$(echo $utf8_o | iconv -f UTF-8 -t ISO-8859-1)" = "$latin1_o" ' or something, and all of that setup should be wrapped in a "test_expect_success ICONV ...". Of course that is the easy part. The hard part is splitting the ICONV setup from the vanilla commit setup so that the other tests can run. 
--- diff --git a/Makefile b/Makefile index e8ce649..c460ce8 100644 --- a/Makefile +++ b/Makefile @@ -2112,6 +2112,7 @@ endif ifdef GIT_TEST_CMP_USE_COPIED_CONTEXT @echo GIT_TEST_CMP_USE_COPIED_CONTEXT=YesPlease >>$@ endif + @echo NO_ICONV=\''$(subst ','\'',$(subst ','\'',$(NO_ICONV)))'\' >>$@ @echo NO_GETTEXT=\''$(subst ','\'',$(subst ','\'',$(NO_GETTEXT)))'\' >>$@ @echo GETTEXT_POISON=\''$(subst ','\'',$(subst ','\'',$(GETTEXT_POISON)))'\' >>$@ ifdef GIT_PERF_REPEAT_COUNT diff --git a/pretty.c b/pretty.c index 9d34d02..74fe5fb 100644 --- a/pretty.c +++ b/pretty.c @@ -1497,7 +1497,7 @@ void format_commit_message(const struct commit *commit, } if (output_enc) { - int outsz; + int outsz = 0; char *out = reencode_string_len(sb->buf, sb->len, output_enc, utf8, &outsz); if (out) diff --git a/strbuf.c b/strbuf.c index 88cafd4..6d8ad4b 100644 --- a/strbuf.c +++ b/strbuf.c @@ -94,7 +94,7 @@ void strbuf_ltrim(struct strbuf *sb) int strbuf_reencode(struct strbuf *sb, const char *from, const char *to) { char *out; - int len; + int len = 0; if (same_encoding(from, to)) return 0; diff --git a/t/t3900-i18n-commit.sh b/t/t3900-i18n-commit.sh index 4bf1dbe..d522677 100755 --- a/t/t3900-i18n-commit.sh +++ b/t/t3900-i18n-commit.sh @@ -7,6 +7,11 @@ test_description='commit and log output encodings' . ./test-lib.sh +if ! test_have_prereq ICONV; then + skip_all='skipping i18n tests, iconv not available' + test_done +fi + compare_with () { git show -s $1 | sed -e '1,/^$/d' -e 's/^//' >current && case "$3" in diff --git a/t/t3901-i18n-patch.sh b/t/t3901-i18n-patch.sh index a392f3d..c4f9d06 100755 --- a/t
[RFD/PATCH] stash: introduce checkpoint mode
"git stash save" performs the steps "create-store-reset". Often, users try to use "stash save" as a way to save their current state (index, worktree) before an operation like "checkout/reset --patch" they don't feel confident about, and are forced to do "git stash save && git stash apply". Provide an extra mode that does "create-store" only, without the reset, so that one can "checkpoint" the state and keep working on it. Suggested-by: "Kyle J. McKay" Signed-off-by: Michael J Gruber --- Notes: I'm not sure about how to best expose this mode: git stash checkpoint git stash save --checkpoint Maybe it is best to document the former and rename "--checkpoint" to "--no-reset"? Also, a "safe return" to a checkpoint probably requires git reset --hard && git stash pop although "git stash pop" will do in many cases. Should we provide a shortcut "restore" which does the reset-and-pop? git-stash.sh | 13 + 1 file changed, 13 insertions(+) diff --git a/git-stash.sh b/git-stash.sh index d4cf818..42f140c 100755 --- a/git-stash.sh +++ b/git-stash.sh @@ -193,12 +193,16 @@ store_stash () { } save_stash () { + checkpoint= keep_index= patch_mode= untracked= while test $# != 0 do case "$1" in + -c|--checkpoint) + checkpoint=t + ;; -k|--keep-index) keep_index=t ;; @@ -267,6 +271,11 @@ save_stash () { die "$(gettext "Cannot save the current status")" say Saved working directory and index state "$stash_msg" + if test -n "$checkpoint" + then + exit 0 + fi + if test -z "$patch_mode" then git reset --hard ${GIT_QUIET:+-q} @@ -576,6 +585,10 @@ save) shift save_stash "$@" ;; +checkpoint) + shift + save_stash "--checkpoint" "$@" + ;; apply) shift apply_stash "$@" -- 2.3.0.191.ge77e8b9 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Interested in helping open source friends on HP-UX?
H.Merijn Brand venit, vidit, dixit 19.02.2015 12:14: > On Thu, 19 Feb 2015 11:33:01 +0100, Michael J Gruber > wrote: > >> Jeff King venit, vidit, dixit 18.02.2015 19:57: >>> On Wed, Feb 18, 2015 at 10:47:16AM -0800, Junio C Hamano wrote: >>> > It seems like we could use > > (cd src && tar cf - .) | (cd dst && tar xf -) > > here as a more portable alternative. I don't think we can rely on rsync > being everywhere. Thanks; I wasn't even aware that we used rsync in our tests. We certainly do not want to rely on it. >>> >>> I don't think we do. >>> >>> Grepping for rsync in t/, it is mentioned in three places: >>> >>> 1. In t1509, we use it, but that test script does not run unless you >>> set a bunch of environment variables to enable it. >>> >>> 2. In a sample patch for t4100. Obviously this one doesn't execute. :) >>> >>> 3. In t5500, to test "rsync:" protocol supported. This is behind a >>> check that we can run rsync at all (though it does not properly use >>> prereqs or use the normal "skip" procedure). >>> Why not "cp -r src dst", though? >>> >>> I was assuming that the "-P" in the original had some purpose. My "cp >>> -r" does not seem to dereference symlinks, but maybe there is something >>> I am missing. >>> >>> -Peff >> >> There's a symlink in sub that needs to be preserved. >> >> I'm cooking up a mini-series covering tar/cp -P so far and hopefully the >> JP encodings later. Do I understand correctly that for Merijin's use > > Merijn, no second j. You can also call me Tux, as that is what the perl > people do just because of that :) > >> case on HP-UX, we want >> >> - as few extra tools (GNU...) as possible for the run time git >> - may get a few more tools installed to run the test > > You can require as many GNU tools for testing as you like: I'll install > them. I just need to be sure they are not required runtime. (tar, cp) > >> I still don't have a clear picture of the iconv situation: Does your >> iconv library require OLD_ICONV to compile? 
> > No >> Is there a reason you want to disable it? > > Yes, if I build a package/depot, and the package depends on iconv, it > is highly likely to fail on the client side after installation, as I do > not control the version of iconv/libiconv installed. > > As HP does not have libiconv installed by default, I have experienced > many tools to be unusable after installation because of that dependency. > > Another reason is that I built 64bitall, as my CURL and SSL environment > is 64bitall for every other project on these systems (including Oracle > related, which *only* ships 64bit objects on HP-UX) and the OpenSource > repos for HP-UX only ship 32bit software (sad, but true). That implies > that I cannot require libiconv.so to be present on the client side. > > I'd like my git to be as standalone as possible OK, so we should use NO_ICONV on HP-UX then. >> Failing so many tests with NO_ICONV is certainly not ideal, but I'm not > sure we should care to protect so many tests with a prerequisite. > > How feasible is it to isolate those tests into separate test files that > people that know to not use e.g. Asian encodings can safely ignore them? > >> Michael We have the prerequisite mechanism for that, and most probably, the tests are "isolated" already, in the sense that with NO_ICONV, only trivial setup tests succeed for those test files but all "proper" tests fail. But I'll check. Need a good test to set the prerequisite, though. Michael
Re: Interested in helping open source friends on HP-UX?
On Thu, 19 Feb 2015 11:33:01 +0100, Michael J Gruber wrote: > Jeff King venit, vidit, dixit 18.02.2015 19:57: > > On Wed, Feb 18, 2015 at 10:47:16AM -0800, Junio C Hamano wrote: > > > >>> It seems like we could use > >>> > >>> (cd src && tar cf - .) | (cd dst && tar xf -) > >>> > >>> here as a more portable alternative. I don't think we can rely on rsync > >>> being everywhere. > >> > >> Thanks; I wasn't even aware that we used rsync in our tests. We > >> certainly do not want to rely on it. > > > > I don't think we do. > > > > Grepping for rsync in t/, it is mentioned in three places: > > > > 1. In t1509, we use it, but that test script does not run unless you > > set a bunch of environment variables to enable it. > > > > 2. In a sample patch for t4100. Obviously this one doesn't execute. :) > > > > 3. In t5500, to test "rsync:" protocol supported. This is behind a > > check that we can run rsync at all (though it does not properly use > > prereqs or use the normal "skip" procedure). > > > >> Why not "cp -r src dst", though? > > > > I was assuming that the "-P" in the original had some purpose. My "cp > > -r" does not seem to dereference symlinks, but maybe there is something > > I am missing. > > > > -Peff > > There's a symlink in sub that needs to be preserved. > > I'm cooking up a mini-series covering tar/cp -P so far and hopefully the > JP encodings later. Do I understand correctly that for Merijin's use Merijn, no second j. You can also call me Tux, as that is what the perl people do just because of that :) > case on HP-UX, we want > > - as few extra tools (GNU...) as possible for the run time git > - may get a few more tools installed to run the test You can require as many GNU tools for testing as you like: I'll install them. I just need to be sure they are not required runtime. (tar, cp) > I still don't have a clear picture of the iconv situation: Does your > iconv library require OLD_ICONV to compile? No > Is there a reason you want to disable it? 
Yes, if I build a package/depot, and the package depends on iconv, it is highly likely to fail on the client side after installation, as I do not control the version of iconv/libiconv installed. As HP does not have libiconv installed by default, I have experienced many tools to be unusable after installation because of that dependency. Another reason is that I built 64bitall, as my CURL and SSL environment is 64bitall for every other project on these systems (including Oracle related, which *only* ships 64bit objects on HP-UX) and the OpenSource repos for HP-UX only ship 32bit software (sad, but true). That implies that I cannot require libiconv.so to be present on the client side. I'd like my git to be as standalone as possible > Failing so many tests with NO_ICONV is certainly not ideal, but I'm not > sure we should care to protect so many tests with a prerequisite. How feasible is it to isolate those tests into separate test files that people that know to not use e.g. Asian encodings can safely ignore them? > Michael -- H.Merijn Brand http://tux.nl Perl Monger http://amsterdam.pm.org/ using perl5.00307 .. 5.21 porting perl5 on HP-UX, AIX, and openSUSE http://mirrors.develooper.com/hpux/ http://www.test-smoke.org/ http://qa.perl.org http://www.goldmark.org/jeff/stupid-disclaimers/
Re: Should "git log --decorate" indicate whether the HEAD is detached?
On Wed, Feb 18, 2015 at 5:07 PM, Junio C Hamano wrote: > Julien's "HEAD=master, other" vs "HEAD, master, other" may be > subdued enough to be undistracting, I would guess. I do not think > the distinction between "HEAD = master" and "HEAD -> master" would > be useful, on the other hand. Just to clarify, I suggested these two notations as alternatives for denoting the same state: "HEAD is attached to master". They were not meant to denote different states. Accordingly, a detached HEAD could be denoted by "HEAD, master, other" (i.e. the same as the current output of "git log --decorate").
Re: Experience with Recovering From User Error (And suggestions for improvements)
Kyle J. McKay venit, vidit, dixit 19.02.2015 02:17: > On Feb 18, 2015, at 01:46, Michael J Gruber wrote: >> Armin Ronacher venit, vidit, dixit 16.02.2015 14:29: >>> Hi, >>> >>> On 16/02/15 13:09, Ævar Arnfjörð Bjarmason wrote: We should definitely make recovery like this harder, but is there a reason for why you don't use "git reset --keep" instead of --hard? >>> This was only the second time in years of git usage that the reset >>> was >>> incorrectly done. I suppose at this point I might try to retrain my >>> muscle memory to type something else :) >>> If we created such hooks for "git reset --hard" we'd just need to expose some other thing as that low-level operation (and break scripts that already rely on it doing the minimal "yes I want to change the tree no matter what" thing), and then we'd just be back to square one in a few years when users started using "git reset --really- hard" (or whatever the flag would be). >>> I don't think that's necessary, I don't think it would make the >>> operation much slower to just make a dangling commit and write out >>> a few >>> blobs. The garbage collect will soon enough take care of that data >>> anyways. But I guess that would need testing on large trees to see >>> how >>> bad that goes. >>> >>> I might look into the git undo thing that was mentioned. >>> >>> Regards, >>> Armin >>> >> >> Are you concerned about the index only, not unstaged worktree changes? >> >> In this case, keeping a reflog for the index may help, and it would >> somehow fit into the overall concept. > > There was this concept of a git stash checkpoint to save work in > progress without creating a normal commit that I read about some time > ago (blog? Git book? -- don't recall) that was basically just this: > >git stash save >git stash apply > > The problem with that is that it touches the working tree and can > trigger rebuilds etc. 
However, when I ran across the undocumented > "git stash create" command I was able to write a simple git-checkpoint > script [1] that creates a new stash entry without touching the index > or working tree which I find quite handy from time to time. I think that would make for a nice additional command/mode that we could support for git-stash.sh. All the pieces are there. > So I think that what Armin originally asked for (create a dangling > commit of changes before reset --hard) could be accomplished simply by > running: > >git checkpoint && git stash drop > >> Otherwise, we would basically need a full stash before a hard reset. >> That's not the first time where we could need a distinction between >> "command run by user" and "command run by script". For the former, we >> could allow overriding default options, re-aliasing internal commands, >> adding expensive safety hooks. For the latter we can't. >> >> It's just that we don't have such a concept yet (other than checking >> tty). > > But of course plugging that into git reset somehow is indeed the > problem since you cannot alias/redefine git commands. > > -Kyle > > [1] https://gist.github.com/mackyle/83b1ba13e263356bdab0 Also, "git stash create" does the tree creation and object creation that we wanted to avoid at least for scripts. And "git reset --hard-but-safe" suffers from the user education problems that have been mentioned already. Michael
Re: Interested in helping open source friends on HP-UX?
Jeff King venit, vidit, dixit 18.02.2015 19:57: > On Wed, Feb 18, 2015 at 10:47:16AM -0800, Junio C Hamano wrote: > >>> It seems like we could use >>> >>> (cd src && tar cf - .) | (cd dst && tar xf -) >>> >>> here as a more portable alternative. I don't think we can rely on rsync >>> being everywhere. >> >> Thanks; I wasn't even aware that we used rsync in our tests. We >> certainly do not want to rely on it. > > I don't think we do. > > Grepping for rsync in t/, it is mentioned in three places: > > 1. In t1509, we use it, but that test script does not run unless you > set a bunch of environment variables to enable it. > > 2. In a sample patch for t4100. Obviously this one doesn't execute. :) > > 3. In t5500, to test "rsync:" protocol supported. This is behind a > check that we can run rsync at all (though it does not properly use > prereqs or use the normal "skip" procedure). > >> Why not "cp -r src dst", though? > > I was assuming that the "-P" in the original had some purpose. My "cp > -r" does not seem to dereference symlinks, but maybe there is something > I am missing. > > -Peff There's a symlink in sub that needs to be preserved. I'm cooking up a mini-series covering tar/cp -P so far and hopefully the JP encodings later. Do I understand correctly that for Merijin's use case on HP-UX, we want - as few extra tools (GNU...) as possible for the run time git - may get a few more tools installed to run the test I still don't have a clear picture of the iconv situation: Does your iconv library require OLD_ICONV to compile? Is there a reason you want to disable it? Failing so many tests with NO_ICONV is certainly not ideal, but I'm not sure we should care to protect so many tests with a prerequisite. Michael
Re: [RFH] GSoC 2015 application
Jeff King writes: > I do need somebody to volunteer as backup admin. This doesn't need > to involve any specific commitment, but is mostly about what to do if I > get hit by a bus. If you promise me to try hard not to be hit by a bus and no one else steps in, I can be the backup admin. > Where I really need help now is in the "ideas" page: > > http://git.github.io/SoC-2015-Ideas.html Throwing out a few ideas for discussion, I can write something if people agree. * "git bisect fixed/unfixed", to allow bisecting a fix instead of a regression less painfully. There were already some proposed patches ( https://git.wiki.kernel.org/index.php/SmallProjectsIdeas#git_bisect_fix.2Funfixed ), so it shouldn't be too hard. Perhaps this item can be included in the "git bisect --first-parent" idea (turning it into "git bisect improvements"). * Be nicer to the user on tracked/untracked merge conflicts I've had it on https://git.wiki.kernel.org/index.php/SmallProjectsIdeas#Be_nicer_to_the_user_on_tracked.2Funtracked_merge_conflicts for a while but never got anyone to do it. "When merging a commit which has tracked files with the same name as local untracked files, Git refuses to proceed. It could be nice to: - Accept the situation without conflict when the tracked file has the exact same content as the local untracked file (which would become tracked). No data is lost, nothing can be committed accidentally. - Possibly, for fast-forward merges, if a local file belongs to the index but not to the last commit, attempt a merge between the upstream version and the local one (resulting in the same content as if the file had just been committed, but without introducing an extra commit). Recent versions of SVN do something similar: on update, it considers added but not committed files like normal tracked files, and attempts a merge of the upstream version with the local one (which always succeeds when the files have identical content). 
Attempting a merge for non-fast forward cases would probably not make sense: it would mix changes coming from the merge with other changes that do not come from a commit." This shouldn't be technically too hard, but finding which behavior is right, where things should be customizable and what the default value for the configuration should be will probably lead to interesting discussions. It contains two steps, which is good (all-or-nothing projects are much harder to deal with). The biggest drawback is that the first item may be simple for a GSoC while the second could be both controversial and hard to implement (depending on which solution is taken). > and the list of microprojects: > > http://git.github.io/SoC-2015-Microprojects.html Here are a few ideas, based on https://git.wiki.kernel.org/index.php/SmallProjectsIdeas -- >8 -- >From 513774754872436ea8b7eea63828b804c6a107e7 Mon Sep 17 00:00:00 2001 From: Matthieu Moy Date: Thu, 19 Feb 2015 10:48:06 +0100 Subject: [PATCH] 2015 microproject ideas --- SoC-2015-Microprojects.md | 42 ++ 1 file changed, 42 insertions(+) diff --git a/SoC-2015-Microprojects.md b/SoC-2015-Microprojects.md index 8cb6a8f..1abf595 100644 --- a/SoC-2015-Microprojects.md +++ b/SoC-2015-Microprojects.md @@ -128,3 +128,45 @@ the user wanted. Because --graph is about connected history while --no-walk is about discrete points. Cf. $gmane/216083 + +### Move ~/.git-credentials and ~/.git-credential-cache to ~/.config/git + +Most of git dotfiles can be located, at the user's option, in +~/. or in ~/.config/git/, following the [XDG +standard](http://standards.freedesktop.org/basedir-spec/basedir-spec-latest.html). +~/.git-credentials and ~/.git-credential-cache are still hardcoded as +~/., and should allow using the XDG directory layout too +(~/.git-credentials could be allowed as ~/.config/git/credential and +~/.git-credential-cache could be allowed as ~/.cache/git/credential, +possibly modified by $XDG_CONFIG_HOME and $XDG_CACHE_HOME). 
+ +Each of these files can be a microproject of its own. The suggested +approach is: + +* See how XDG was implemented for other files (run "git log --grep + XDG" in Git's source code) and read the XDG specification. + +* Implement and test the new behavior, without breaking compatibility + with the old behavior. + +* Update the documentation + +### Add configuration options for some commonly used command-line options + +This includes: + +* git am -3 + +* git am -c + +Some people always run the command with these options, and would +prefer to be able to activate them by default in ~/.gitconfig. + +### Add more builtin patterns for userdiff + +"git diff" shows the function name corresponding to each hunk after +the @@ ... @@ line. For common languages (C, HTML, Ada, Matlab, ...), +the way to find the function name is built into Git's source code as +regular expressions.
Re: [RFC] git cat-file "literally" option
On 02/18/2015 07:28 PM, Duy Nguyen wrote: > On Wed, Feb 18, 2015 at 7:50 > Use what sha1_object_info() uses behind the scene. Loose object > encodes object type as a string, you could just print that string and > skip the enum object_type conversion. You probably need special > treatment for packed objects too. See parse_sha1_header() and > unpack_object_header(). Thank you, I will look into that! On 02/18/2015 09:17 PM, Junio C Hamano wrote: On Wed, Feb 18, 2015 at 5:58 AM, Duy Nguyen wrote: ... skip the enum object_type conversion. You probably need special treatment for packed objects too. I do not think you can store an object of type "bogus" in a pack data stream to begin with, so I wouldn't worry about packed objects. "cat-file --literally" that does not take "-t" would not be useful, as the output "cat-file " does not tell what the thing is. Other things like sizes and existence can be inferred once you have an interface to do "cat-file ", so in that sense -e and -s are not essential (this also applies to "cat-file" without --literally). By definition, "--literally -p" would not be able to do anything fancier than just dump the bytes (i.e. what "cat-file " does), as the bogus type is not something the code would know the best external representation for. Thanks for clearing that up. Will work on this for now.
Re: Should "git log --decorate" indicate whether the HEAD is detached?
Junio C Hamano venit, vidit, dixit 18.02.2015 20:49: > Michael J Gruber writes: > >> Yep, it very well is. Also, that approach would tell you which branch is >> checked out, though I don't consider that git log's business. >> >> OTOH, it's "backwards" in the sense that it marks the "ordinary" case >> (HEAD is symref, branch is checked out) specially compared to the >> "exceptional/dangerous" case (HEAD is ref, detached). > > Both are ordinary and there is nothing exceptional or dangerous > about your HEAD temporarily being detached during a "rebase -i" > session, for example. Sure, that's why I put it in quotes. That's only how it is perceived by some users, and I suppose it's that kind of users that we are trying to help here. >> And status, branch >> will point out that latter case more verbosely, too. > > Yeah, but as you said, that is not "log"'s business. I still think the decorations "detached HEAD" and "HEAD", respectively, for the two cases are more natural, if we want to include any additional information at all. Just think of: deadbeef (HEAD=master, topicbranch, tag: v1) log/rev-list is about commit objects. All the refs above resolve to the same commit, so why are only two of them equal? In fact, they are very unequal, since HEAD would be "ref: refs/heads/master" whereas master would be "deadbeef". They are equal in the other (detached) case! I'm not telling you any news here, I just want to point out how badly misleading that notation is. So, I would suggest to "decorate the decorations" by saying something like "detached HEAD", and maybe some version of "HEAD at master" (I'd prefer just "HEAD") and possibly more info on the tags ("s-tag" or "signed tag" etc). Michael