Re: [PATCH v7 06/13] pack-objects: move in_pack out of struct object_entry
On Fri, Mar 30, 2018 at 10:48 PM, Jeff Kingwrote: > On Sat, Mar 24, 2018 at 07:33:46AM +0100, Nguyễn Thái Ngọc Duy wrote: > >> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c >> index e1244918a5..b41610569e 100644 >> --- a/builtin/pack-objects.c >> +++ b/builtin/pack-objects.c >> @@ -29,6 +29,8 @@ >> #include "list.h" >> #include "packfile.h" >> >> +#define IN_PACK(obj) oe_in_pack(_pack, obj) > > How come this one gets a macro, but the earlier conversions don't? > > I guess the problem is that oe_in_pack() is defined in the generic > pack-objects.h, but _pack is only in builtin/pack-objects.c? > > I wonder if it would be that bad to just say oe_in_pack(_pack, obj) > everywhere. It's longer, but it makes the code slightly less magical to > read. Longer was exactly why I added these macros (with the hope that the macro upper case names already ring a "it's magical" bell). Should I drop all these macros? Some code becomes a lot more verbose though. >> +static void prepare_in_pack_by_idx(struct packing_data *pdata) >> +{ >> + struct packed_git **mapping, *p; >> + int cnt = 0, nr = 1 << OE_IN_PACK_BITS; >> + >> + if (getenv("GIT_TEST_FULL_IN_PACK_ARRAY")) { >> + /* >> + * leave in_pack_by_idx NULL to force in_pack[] to be >> + * used instead >> + */ >> + return; >> + } >> + >> + ALLOC_ARRAY(mapping, nr); >> + mapping[cnt++] = NULL; /* zero index must be mapped to NULL */ > > Why? I guess because index==0 is a sentinel for "we're using the small > index numbers?" No because by default all values in object_entry is zero (or NULL). If I remember correctly, some code will skip setting in_pack pointer to leave it NULL. When we convert it to an index, it should also point to NULL. >> + prepare_packed_git(); >> + for (p = packed_git; p; p = p->next, cnt++) { >> + if (cnt == nr) { >> + free(mapping); >> + return; >> + } >> + p->index = cnt; >> + mapping[cnt] = p; >> + } >> + pdata->in_pack_by_idx = mapping; >> +} > > What happens if we later have to reprepare_packed_git() and end up with > more packs? We only call this for the first pack. > > It may well be handled, but I'm having trouble following the code to see > if it is. And I doubt that case is covered by our test suite (since it > inherently involves a race). I don't think I covered this case. But since "index" field in packed_git should be zero for the new packs, we could check and either add it to in_pack_by_idx[]. >> /* >> + * The size of struct nearly determines pack-objects's memory >> + * consumption. This struct is packed tight for that reason. When you >> + * add or reorder something in this struct, think a bit about this. >> + * > > It's funny to see this warning come in the middle. Should it be part of > the final struct reordering at the end? It was at the end in some version, the I shuffled the patches and forgot about this one :) -- Duy
Re: [PATCH v7 08/13] pack-objects: shrink z_delta_size field in struct object_entry
On Fri, Mar 30, 2018 at 10:59 PM, Jeff Kingwrote: > On Sat, Mar 24, 2018 at 07:33:48AM +0100, Nguyễn Thái Ngọc Duy wrote: > >> We only cache deltas when it's smaller than a certain limit. This limit >> defaults to 1000 but save its compressed length in a 64-bit field. >> Shrink that field down to 16 bits, so you can only cache 65kb deltas. >> Larger deltas must be recomputed at when the pack is written down. > > Unlike the depth, I don't think there's any _inherent_ reason you > couldn't throw, say, 1MB deltas into the cache (if you sized it large > enough). But I doubt such deltas are really all that common. Here are > the top 10 in linux.git: > > $ git cat-file --batch-all-objects --batch-check='%(deltabase) > %(objectsize:disk)' | > grep -v ^0 | sort -k 2nr | head > a02b6794337286bc12c907c33d5d75537c240bd0 769103 > b28d4b64c05da02c5e8c684dcb9422876225ebdc 327116 > 1e98ce86ed19aff9ba721d13a749ff08088c9922 325257 > a02b6794337286bc12c907c33d5d75537c240bd0 240647 > c550d99286c01867dfb26e432417f3106acf8611 177896 > 5977795854f852c2b95dd023fd03cace023ee41c 119737 > 4ccf9681c45d01d17376f7e0d266532a4460f5f8 112671 > b39fb6821faa9e7bc36de738152a2817b4bf3654 112657 > 2645d6239b74bebd661436762e819b831095b084 103980 > b8ce7fe5d8def58dc63b7ae099eff7bd07e4e845 101014 > > It's possible some weird workload would want to tweak this. Say you were > storing a ton of delta-capable files that were big and always differed > by a megabyte. And it was somehow really important to you to tradeoff > memory for CPU during the write phase of a pack. We're not short on spare bits so I will try to raise this limit to 1MB (not because you mentioned 1MB, but because the largest size in your output is close to 1MB). -- Duy
Re: [PATCH v7 10/13] pack-objects: clarify the use of object_entry::size
On Fri, Mar 30, 2018 at 11:04 PM, Jeff Kingwrote: > The subject says "clarify" so I was a little surprised to see code > changes. It looks like we're just avoiding reassigning on top of the > value repeatedly, which is part of that clarification. It looks like a > noop to me. Oh well... I was counting on the new name (in_pack_size, which follows in_pack_type naming convention) to emphasize it (and the new "delta size" comment to point out where in_pack_size contains a delta size. -- Duy
Re: [PATCH v7 12/13] pack-objects: shrink delta_size field in struct object_entry
On Fri, Mar 30, 2018 at 11:24 PM, Jeff Kingwrote: > On Sat, Mar 24, 2018 at 07:33:52AM +0100, Nguyễn Thái Ngọc Duy wrote: >> @@ -2004,10 +2006,12 @@ static int try_delta(struct unpacked *trg, struct >> unpacked *src, >> delta_buf = create_delta(src->index, trg->data, trg_size, _size, >> max_size); >> if (!delta_buf) >> return 0; >> + if (delta_size >= (1 << OE_DELTA_SIZE_BITS)) >> + return 0; > > This is the other spot that needs to be "1U". > > How come this doesn't get a pdata->oe_delta_size_limit like we have > pdata->oe_size_limit? Would we want a matching > $GIT_TEST_OE_DELTA_SIZE_BITS to test it, too? Probably. This change does not look as risky as the others (no complicated fallback). But without $GIT_TEST_OE_DELTA_SIZE_BITS it's hard to know how the new code reacts when we get over the limit. I will add it. -- Duy
Re: [PATCH v7 13/13] pack-objects: reorder members to shrink struct object_entry
On Fri, Mar 30, 2018 at 11:26 PM, Jeff Kingwrote: > On Sat, Mar 24, 2018 at 07:33:53AM +0100, Nguyễn Thái Ngọc Duy wrote: > >> Previous patches leave lots of holes and padding in this struct. This >> patch reorders the members and shrinks the struct down to 80 bytes >> (from 136 bytes, before any field shrinking is done) with 16 bits to >> spare (and a couple more in in_pack_header_size when we really run out >> of bits). > > Out of curiosity, did you count this yourself, or did you double-check > with a few compilers to make sure they all produce the same result? I used pahole though only with .o files created by gcc 64-bit. I'll try the 32-bit version and clang as well. -- Duy
Investment opportunity
Greetings, Please find the content of this mail very confidential. my name is Yi Tan, I work with a Bank here in Hong Kong. I decided to contact you for an opportunity to invest in any lucrative business in your country, I am willing to offer you 40% as my business partner. We also offer a quick loan at low interest rate, if you are interested please reply me for more information on my private e-mail: yitanelect...@gmail.com Sincerely: Yi Tan
Compliment of the day ,
-- Hi dear. It is wonderful to contact you, I want us to have correspondence. I wish you will have the desire so that we can get acquainted to each other. Life itself is a mystery, you never know where it might lead you. I'm Tracy.William, a French American . I will be pleased if you reply through.(tracy.medce...@outlook.com) With Love Tracy --
Re: [RFC PATCH 0/1] bdl-lib.sh: add bash debug logger
On Fri, Mar 30, 2018 at 3:34 AM, Johannes Schindelinwrote: > Hi Wink, > > On Tue, 27 Mar 2018, Wink Saville wrote: > >> Add bdl-lib.sh which provides functions to assit in debugging git >> shell scripts and tests. > > Interesting idea. It is Bash-only, though... (and it is not a secret that > I want to discourage using Unix shell scripts in Git's production code, > they are a decent way to prototype things, but they fall short of being > robust and portable in practice, and don't get me started on speed > issues). > > So rather than spending time on making it easier to debug shell scripts, I > would love to see us going into the direction of a consistent C source > code. I still believe that we can get there, and that the benefits are > worth the (huge) effort. > > Ciao, > Johannes I totally agree the code base should use primarily one language! But that's not the case now and "bdl" gave me insight into the workings of rebase--interactive and I found little guidance on how to debug or learn the code base. So I thought I'd see if there was interest in what I'd done. If I were to make it non-bash specific would there be any interest? -- Wink
Re: [PATCH] fast-import: fix a sparse 'NULL pointer' warning
Ramsay Joneswrites: > This was going to be an email to Jameson, but I wasn't fast enough! :( > > This was originally written against the 'pu' branch, but since the > 'jm/mem-pool' branch has graduated to 'next', I have rebased on 'next'. > > Perhaps this could be squashed into that branch when you re-build > the 'next' branch at the beginning of the next cycle? Thanks. Yes, that would be the most sensible thing to do, I would think, as the topic itself (like many other branches that are merged to 'next' during the pre-release feature freeze period) is not ready for the 'master' branch yet anyway.
[PATCH] fast-import: fix a sparse 'NULL pointer' warning
Commit a8dfa11562 ("fast-import: introduce mem_pool type", 2018-03-26) introduces a 'mem_pool' type, along with a file-local global symbol ('fi_mem_poll') which is initialised in its declaration. This causes sparse to issue a warning, thus: SP fast-import.c fast-import.c:301:40: warning: Using plain integer as NULL pointer In order to suppress the warning, replace the '0' used to initialise the 'mp_block' field (of type 'struct mp_block *') with NULL. Signed-off-by: Ramsay Jones--- Hi Junio, This was going to be an email to Jameson, but I wasn't fast enough! :( This was originally written against the 'pu' branch, but since the 'jm/mem-pool' branch has graduated to 'next', I have rebased on 'next'. Perhaps this could be squashed into that branch when you re-build the 'next' branch at the beginning of the next cycle? ATB, Ramsay Jones fast-import.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fast-import.c b/fast-import.c index 0aa148ea4..38356e293 100644 --- a/fast-import.c +++ b/fast-import.c @@ -298,7 +298,7 @@ static int global_argc; static const char **global_argv; /* Memory pools */ -static struct mem_pool fi_mem_pool = {0, 2*1024*1024 - sizeof(struct mp_block), 0 }; +static struct mem_pool fi_mem_pool = {NULL, 2*1024*1024 - sizeof(struct mp_block), 0 }; /* Atom management */ static unsigned int atom_table_sz = 4451; -- 2.16.0
Re: [GSoC][PATCH v5] test: avoid pipes in git related commands for test
Eric Sunshinewrites: > On Tue, Mar 27, 2018 at 1:31 PM, Pratik Karki wrote: >> Avoid using pipes downstream of Git commands since the exit codes >> of commands upstream of pipes get swallowed, thus potentially >> hiding failure of those commands. Instead, capture Git command >> output to a file and apply the downstream command(s) to that file. >> >> >> Signed-off-by: Pratik Karki > > Unnecessary double blank line above sign-off. "git am" would automatically trigger stripspace, which would eat the extra blank line from that two-blank-line block. > Aside from that minor hiccup (which Junio fixed when queuing), this > iteration addresses all my review comments[1] from the previous round > and does not seem to introduce any new issues. Thanks for a review.
Re: What's cooking in git.git (Mar 2018, #06; Fri, 30)
On 3/30/2018 4:38 PM, Junio C Hamano wrote: * jh/json-writer (2018-03-28) 1 commit - json_writer: new routines to create data in JSON format Is this ready for 'next'? Yes, I believe it is. This has the V6 fixup for the HEREDOC with leading whitespace, so I think we're good. Thanks Jeff
Re: [GSoC][PATCH v5] test: avoid pipes in git related commands for test
On Tue, Mar 27, 2018 at 1:31 PM, Pratik Karkiwrote: > Avoid using pipes downstream of Git commands since the exit codes > of commands upstream of pipes get swallowed, thus potentially > hiding failure of those commands. Instead, capture Git command > output to a file and apply the downstream command(s) to that file. > > > Signed-off-by: Pratik Karki Unnecessary double blank line above sign-off. Aside from that minor hiccup (which Junio fixed when queuing), this iteration addresses all my review comments[1] from the previous round and does not seem to introduce any new issues. Thanks. [1]: https://public-inbox.org/git/CAPig+cS3GjYo+5C_W6WqzK3RP=W+918E6Cz=fsvhky6ewce...@mail.gmail.com/
Re: What's cooking in git.git (Mar 2018, #06; Fri, 30)
Junio C Hamanowrites: > -- > [Graduated to "master"] > > * jh/partial-clone (2018-03-25) 1 commit > (merged to 'next' on 2018-03-28 at 2a0a7aef8e) > + unpack-trees: release oid_array after use in check_updates() > > Hotfix. Not listed in the above, but this round also merges the hotfix c7620bd0 ("upload-pack: disable object filtering when disabled by config", 2018-03-28) from Jonathan Nieder that makes sure that "uploadpack.allowfilter" does disable the feature even when the client makes an unsolicited request to trigger the "filter" feature. It is not (yet) clear how I screwed up this report; sorry about that.
Re: [PATCH v7 13/13] pack-objects: reorder members to shrink struct object_entry
On Sat, Mar 24, 2018 at 07:33:53AM +0100, Nguyễn Thái Ngọc Duy wrote: > Previous patches leave lots of holes and padding in this struct. This > patch reorders the members and shrinks the struct down to 80 bytes > (from 136 bytes, before any field shrinking is done) with 16 bits to > spare (and a couple more in in_pack_header_size when we really run out > of bits). Out of curiosity, did you count this yourself, or did you double-check with a few compilers to make sure they all produce the same result? So having read the whole thing now, I think most of my original concerns have been addressed. I do think readability takes a hit, but it's not _too_ bad. There are a few things that have become more brittle, but I can't think of anything on the horizon that would bite us. -Peff
Re: [PATCH v7 12/13] pack-objects: shrink delta_size field in struct object_entry
On Sat, Mar 24, 2018 at 07:33:52AM +0100, Nguyễn Thái Ngọc Duy wrote: > Allowing a delta size of 64 bits is crazy. Shrink this field down to > 31 bits with one overflow bit. > > If we find an existing delta larger than 2GB, we do not cache > delta_size at all and will get the value from oe_size(), potentially > from disk if it's larger than 4GB. Since we have a fallback, we can put this slider wherever we want. Probably something like 20 bits would be plenty, if we ever needed to squeeze in a few more small-bit items. > @@ -2004,10 +2006,12 @@ static int try_delta(struct unpacked *trg, struct > unpacked *src, > delta_buf = create_delta(src->index, trg->data, trg_size, _size, > max_size); > if (!delta_buf) > return 0; > + if (delta_size >= (1 << OE_DELTA_SIZE_BITS)) > + return 0; This is the other spot that needs to be "1U". How come this doesn't get a pdata->oe_delta_size_limit like we have pdata->oe_size_limit? Would we want a matching $GIT_TEST_OE_DELTA_SIZE_BITS to test it, too? -Peff
Re: [PATCH v7 11/13] pack-objects: shrink size field in struct object_entry
On Sat, Mar 24, 2018 at 07:33:51AM +0100, Nguyễn Thái Ngọc Duy wrote: > It's very very rare that an uncompressed object is larger than 4GB > (partly because Git does not handle those large files very well to > begin with). Let's optimize it for the common case where object size > is smaller than this limit. > > Shrink size field down to 32 bits [1] and one overflow bit. If the > size is too large, we read it back from disk. As noted in the previous > patch, we need to return the delta size instead of canonical size when > the to-be-reused object entry type is a delta instead of a canonical > one. > > Add two compare helpers that can take advantage of the overflow > bit (e.g. if the file is 4GB+, chances are it's already larger than > core.bigFileThreshold and there's no point in comparing the actual > value). > > Another note about oe_get_size_slow(). This function MUST be thread > safe because SIZE() macro is used inside try_delta() which may run in > parallel. Outside parallel code, no-contention locking should be dirt > cheap (or insignificant compared to i/o access anyway). To exercise > this code, it's best to run the test suite with something like > > make test GIT_TEST_OE_SIZE_BITS=2 > > which forces this code on all objects larger than 3 bytes. OK, makes sense. Since we need it to be thread-safe, we have to use read_lock(). Which means that oe_get_size_slow() is defined in builtin/pack-objects.c. But the object_entry is defined in the more-generic pack-objects.h. So anybody besides builtin/pack-objects.c will have to implement their own fallback when e->size_valid isn't true. Which is a little odd, but I guess nobody else needs that field. It might bite us in the future, but I'm willing to cross my fingers for now (the pack-objects.h header is really just there to support the bitmap writing code, but even that could in theory all get shoved into a single translation unit if we had to). > [1] it's actually already 32 bits on Windows And linux-i386. :) > +unsigned long oe_get_size_slow(struct packing_data *pack, > +const struct object_entry *e) > +{ I think I already replied about this earlier, so I'll skim over it this time. > diff --git a/pack-objects.c b/pack-objects.c > index 13f2b2bff2..59c6e40a02 100644 > --- a/pack-objects.c > +++ b/pack-objects.c > @@ -120,8 +120,15 @@ struct object_entry *packlist_alloc(struct packing_data > *pdata, > { > struct object_entry *new_entry; > > - if (!pdata->nr_objects) > + if (!pdata->nr_objects) { > prepare_in_pack_by_idx(pdata); > + if (getenv("GIT_TEST_OE_SIZE_BITS")) { > + int bits = atoi(getenv("GIT_TEST_OE_SIZE_BITS"));; > + pdata->oe_size_limit = 1 << bits; > + } > + if (!pdata->oe_size_limit) > + pdata->oe_size_limit = 1 << OE_SIZE_BITS; > + } This needs to be "1U << OE_SIZE_BITS". Shifting a signed integer 31 bits is undefined. No, I'm not that clever or careful myself. I ran the whole test suite with SANITIZE=address,undefined and it turned this up, as well as a similar case for OE_DELTA_SIZE_BITS. > + uint32_t size_:OE_SIZE_BITS; > + uint32_t size_valid:1; A uint32_t bitfield? Would it make more sense to just call these "unsigned", since we're specifying the precision already? > +unsigned long oe_get_size_slow(struct packing_data *pack, > +const struct object_entry *e); > +static inline unsigned long oe_size(struct packing_data *pack, > + const struct object_entry *e) > +{ > + if (e->size_valid) > + return e->size_; > + > + return oe_get_size_slow(pack, e); > +} If oe_get_size_slow() fails to find an object's size, it dies. I'm trying to think of whether that might hit funny corner cases with racing. I don't _think_ so, because if the object truly goes away, we'd be screwed during the writing phase anyway. > +static inline int oe_size_less_than(struct packing_data *pack, > + const struct object_entry *lhs, > + unsigned long rhs) > +{ > + if (lhs->size_valid) > + return lhs->size_ < rhs; > + if (rhs < pack->oe_size_limit) /* rhs < 2^x <= lhs ? */ > + return 0; > + return oe_get_size_slow(pack, lhs) < rhs; > +} Clever. > +static inline void oe_set_size(struct packing_data *pack, > +struct object_entry *e, > +unsigned long size) > +{ > + if (size < pack->oe_size_limit) { > + e->size_ = size; > + e->size_valid = 1; > + } else { > + e->size_valid = 0; > + if (oe_get_size_slow(pack, e) != size) > + die("BUG: 'size' is supposed to be the object size!"); > + } > +} That's an expensive assertion. But I guess this isn't supposed to happen very frequently, so
Re: [PATCH v7 10/13] pack-objects: clarify the use of object_entry::size
On Sat, Mar 24, 2018 at 07:33:50AM +0100, Nguyễn Thái Ngọc Duy wrote: > While this field most of the time contains the canonical object size, > there is one case it does not: when we have found that the base object > of the delta in question is also to be packed, we will very happily > reuse the delta by copying it over instead of regenerating the new > delta. > > "size" in this case will record the delta size, not canonical object > size. Later on in write_reuse_object(), we reconstruct the delta > header and "size" is used for this purpose. When this happens, the > "type" field contains a delta type instead of a canonical type. > Highlight this in the code since it could be tricky to see. Thanks for digging down here. I have definitely been confused by this in the past. The subject says "clarify" so I was a little surprised to see code changes. It looks like we're just avoiding reassigning on top of the value repeatedly, which is part of that clarification. It looks like a noop to me. -Peff
Re: [PATCH v7 08/13] pack-objects: shrink z_delta_size field in struct object_entry
On Sat, Mar 24, 2018 at 07:33:48AM +0100, Nguyễn Thái Ngọc Duy wrote: > We only cache deltas when it's smaller than a certain limit. This limit > defaults to 1000 but save its compressed length in a 64-bit field. > Shrink that field down to 16 bits, so you can only cache 65kb deltas. > Larger deltas must be recomputed at when the pack is written down. Unlike the depth, I don't think there's any _inherent_ reason you couldn't throw, say, 1MB deltas into the cache (if you sized it large enough). But I doubt such deltas are really all that common. Here are the top 10 in linux.git: $ git cat-file --batch-all-objects --batch-check='%(deltabase) %(objectsize:disk)' | grep -v ^0 | sort -k 2nr | head a02b6794337286bc12c907c33d5d75537c240bd0 769103 b28d4b64c05da02c5e8c684dcb9422876225ebdc 327116 1e98ce86ed19aff9ba721d13a749ff08088c9922 325257 a02b6794337286bc12c907c33d5d75537c240bd0 240647 c550d99286c01867dfb26e432417f3106acf8611 177896 5977795854f852c2b95dd023fd03cace023ee41c 119737 4ccf9681c45d01d17376f7e0d266532a4460f5f8 112671 b39fb6821faa9e7bc36de738152a2817b4bf3654 112657 2645d6239b74bebd661436762e819b831095b084 103980 b8ce7fe5d8def58dc63b7ae099eff7bd07e4e845 101014 It's possible some weird workload would want to tweak this. Say you were storing a ton of delta-capable files that were big and always differed by a megabyte. And it was somehow really important to you to tradeoff memory for CPU during the write phase of a pack. That seems pretty unlikely to bite anybody (and that was the best I could come up with as a devil's advocate against it). > Signed-off-by: Nguyễn Thái Ngọc Duy> --- > Documentation/config.txt | 3 ++- > builtin/pack-objects.c | 22 -- > pack-objects.h | 3 ++- > 3 files changed, 20 insertions(+), 8 deletions(-) Patch looks OK. -Peff
Re: What's cooking in git.git (Mar 2018, #06; Fri, 30)
On 03/30, Junio C Hamano wrote: > * tg/worktree-add-existing-branch (2018-03-27) 6 commits > - t2025: rename now outdated branch name > - worktree: teach "add" to check out existing branches > - worktree: factor out dwim_branch function > - worktree: remove force_new_branch from struct add_opts > - worktree: be clearer when "add" dwim-ery kicks in > - worktree: improve message when creating a new worktree > > "git worktree add" learned to check out an existing branch. > > Is this ready for 'next'? Not quite yet. Eric spotted some UI deficiencies which I'm currently trying to address. I hope to re-roll this in a few days with those deficiencies fixed.
Re: [PATCH v7 07/13] pack-objects: refer to delta objects by index instead of pointer
On Sat, Mar 24, 2018 at 07:33:47AM +0100, Nguyễn Thái Ngọc Duy wrote: > These delta pointers always point to elements in the objects[] array > in packing_data struct. We can only hold maximum 4G of those objects > because the array size in nr_objects is uint32_t. We could use > uint32_t indexes to address these elements instead of pointers. On > 64-bit architecture (8 bytes per pointer) this would save 4 bytes per > pointer. > > Convert these delta pointers to indexes. Since we need to handle NULL > pointers as well, the index is shifted by one [1]. > > [1] This means we can only index 2^32-2 objects even though nr_objects > could contain 2^32-1 objects. It should not be a problem in > practice because when we grow objects[], nr_alloc would probably > blow up long before nr_objects hits the wall. Hmm, that may be something we eventually fix. I suspect all of this code does some pretty horrible things as you approach 2^32 objects, though. I've never tried to make such a pack, but it may be within the realm of possibility. The .idx file would be 80+GB, but the packfile might not be much bigger if specially crafted. I guess that's outside the realm of reasonable, though, so we can assume that nobody would _really_ want to do that anytime soon. And anything malicious would probably die long before this code triggers. > Signed-off-by: Nguyễn Thái Ngọc Duy> --- > builtin/pack-objects.c | 116 ++--- > pack-objects.h | 67 ++-- > 2 files changed, 124 insertions(+), 59 deletions(-) The patch itself looks OK. This is one of the nicer ones, because it really doesn't involve any extra storage management, just some accessor functions. -Peff
Re: [PATCH v7 06/13] pack-objects: move in_pack out of struct object_entry
On Sat, Mar 24, 2018 at 07:33:46AM +0100, Nguyễn Thái Ngọc Duy wrote: > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > index e1244918a5..b41610569e 100644 > --- a/builtin/pack-objects.c > +++ b/builtin/pack-objects.c > @@ -29,6 +29,8 @@ > #include "list.h" > #include "packfile.h" > > +#define IN_PACK(obj) oe_in_pack(_pack, obj) How come this one gets a macro, but the earlier conversions don't? I guess the problem is that oe_in_pack() is defined in the generic pack-objects.h, but _pack is only in builtin/pack-objects.c? I wonder if it would be that bad to just say oe_in_pack(_pack, obj) everywhere. It's longer, but it makes the code slightly less magical to read. > @@ -1074,7 +1076,7 @@ static void create_object_entry(const struct object_id > *oid, > else > nr_result++; > if (found_pack) { > - entry->in_pack = found_pack; > + oe_set_in_pack(_pack, entry, found_pack); > entry->in_pack_offset = found_offset; > } it's funny to see in_pack as an external thing, but in_pack_offset still in the struct. I guess there's nothing to be gained there, since the offset really does need to be individual (and large). > diff --git a/cache.h b/cache.h > index 862bdff83a..b90feb3802 100644 > --- a/cache.h > +++ b/cache.h > @@ -1635,6 +1635,7 @@ extern struct packed_git { > int index_version; > time_t mtime; > int pack_fd; > + int index; /* for builtin/pack-objects.c */ > unsigned pack_local:1, >pack_keep:1, >freshened:1, It's pretty gross to infect this global struct. But I'm not sure there's an easier way to do it with constant-time lookups. You'd have to build the packed_git index preemptively in pack-objects, and then always just pass around the index numbers. And even that is kind of dicey, since the packed_git list can grow while we're running. The alternative is a hash table mapping packed_git pointers into numeric indices. Yuck. > +static void prepare_in_pack_by_idx(struct packing_data *pdata) > +{ > + struct packed_git **mapping, *p; > + int cnt = 0, nr = 1 << OE_IN_PACK_BITS; > + > + if (getenv("GIT_TEST_FULL_IN_PACK_ARRAY")) { > + /* > + * leave in_pack_by_idx NULL to force in_pack[] to be > + * used instead > + */ > + return; > + } > + > + ALLOC_ARRAY(mapping, nr); > + mapping[cnt++] = NULL; /* zero index must be mapped to NULL */ Why? I guess because index==0 is a sentinel for "we're using the small index numbers?" > + prepare_packed_git(); > + for (p = packed_git; p; p = p->next, cnt++) { > + if (cnt == nr) { > + free(mapping); > + return; > + } > + p->index = cnt; > + mapping[cnt] = p; > + } > + pdata->in_pack_by_idx = mapping; > +} What happens if we later have to reprepare_packed_git() and end up with more packs? We only call this for the first pack. It may well be handled, but I'm having trouble following the code to see if it is. And I doubt that case is covered by our test suite (since it inherently involves a race). > /* > + * The size of struct nearly determines pack-objects's memory > + * consumption. This struct is packed tight for that reason. When you > + * add or reorder something in this struct, think a bit about this. > + * It's funny to see this warning come in the middle. Should it be part of the final struct reordering at the end? -Peff
What's cooking in git.git (Mar 2018, #06; Fri, 30)
Here are the topics that have been cooking. Commits prefixed with '-' are only in 'pu' (proposed updates) while commits prefixed with '+' are in 'next'. The ones marked with '.' do not appear in any of the integration branches, but I am still holding onto them. Git 2.17 final is expected to be tagged by early next week. You can find the changes described here in the integration branches of the repositories listed at http://git-blame.blogspot.com/p/git-public-repositories.html -- [Graduated to "master"] * jh/partial-clone (2018-03-25) 1 commit (merged to 'next' on 2018-03-28 at 2a0a7aef8e) + unpack-trees: release oid_array after use in check_updates() Hotfix. -- [New Topics] * rs/status-with-removed-submodule (2018-03-28) 1 commit (merged to 'next' on 2018-03-30 at 8a7b618bc1) + submodule: check for NULL return of get_submodule_ref_store() "git submodule status" misbehaved on a submodule that has been removed from the working tree. Will cook in 'next'. * lv/tls-1.3 (2018-03-29) 1 commit (merged to 'next' on 2018-03-30 at 4f13731408) + http: allow use of TLS 1.3 When built with more recent cURL, GIT_SSL_VERSION can now specify "tlsv1.3" as its value. Will cook in 'next'. * nd/warn-more-for-devs (2018-03-29) 3 commits - Makefile: add EAGER_DEVELOPER mode - Makefile: detect compiler and enable more warnings in DEVELOPER=1 - connect.c: mark die_initial_contact() NORETURN The build procedure "make DEVELOPER=YesPlease" learned to enable a bit more warning options depending on the compiler used to help developers more. There also is "make EAGER_DEVELOPER=YesPlease" available now, for those who want to help fixing warnings we usually ignore. Will merge to 'next'. * sb/submodule-move-nested (2018-03-29) 6 commits - submodule: fixup nested submodules after moving the submodule - submodule-config: remove submodule_from_cache - submodule-config: add repository argument to submodule_from_{name, path} - submodule-config: allow submodule_free to handle arbitrary repositories - grep: remove "repo" arg from non-supporting funcs - submodule.h: drop declaration of connect_work_tree_and_git_dir (this branch uses nd/remove-ignore-env-field and sb/object-store; is tangled with sb/packfiles-in-repository.) Moving a submodule that itself has submodule in it with "git mv" forgot to make necessary adjustment to the nested sub-submodules; now the codepath learned to recurse into the submodules. * tb/config-type (2018-03-29) 1 commit - builtin/config.c: prefer `--type=bool` over `--bool`, etc. (this branch is used by tb/config-default.) The "git config" command uses separate options e.g. "--int", "--bool", etc. to specify what type the caller wants the value to be interpreted as. A new "--type=" option has been introduced, which would make it cleaner to define new types. Will merge to 'next'. * tb/config-default (2018-03-29) 3 commits - builtin/config: introduce `color` type specifier - config.c: introduce 'git_config_color' to parse ANSI colors - builtin/config: introduce `--default` (this branch uses tb/config-type.) "git config --get" learned the "--default" option, to help the calling script. Building on top of the tb/config-type topic, the "git config" learns "--type=color" type. Taken together, you can do things like "git config --get foo.color --default blue" and get the ANSI color sequence for the color given to foo.color variable, or "blue" if the variable does not exist. * eb/cred-helper-ignore-sigpipe (2018-03-29) 1 commit (merged to 'next' on 2018-03-30 at c48e98c1b1) + credential: ignore SIGPIPE when writing to credential helpers When credential helper exits very quickly without reading its input, it used to cause Git to die with SIGPIPE, which has been fixed. Will cook in 'next'. * jk/flockfile-stdio (2018-03-30) 1 commit - config: move flockfile() closer to unlocked functions * jk/relative-directory-fix (2018-03-30) 5 commits - refs: use chdir_notify to update cached relative paths - set_work_tree: use chdir_notify - add chdir-notify API - trace.c: export trace_setup_key - set_git_dir: die when setenv() fails -- [Stalled] * sb/blame-color (2018-02-13) 3 commits - builtin/blame: highlight recently changed lines - builtin/blame: add option to color metadata fields separately - builtin/blame: dim uninteresting metadata Expecting a reroll. cf. https://public-inbox.org/git/20171110011002.10179-1-sbel...@google.com/#t error messages are funny, can segfault, ... * av/fsmonitor-updates (2018-01-04) 6 commits - fsmonitor: use fsmonitor data in `git diff` - fsmonitor: remove debugging lines from t/t7519-status-fsmonitor.sh - fsmonitor: make output of test-dump-fsmonitor more concise - fsmonitor: update helper tool, now that flags are filled later - fsmonitor:
Re: [PATCH v7 05/13] pack-objects: move in_pack_pos out of struct object_entry
On Sat, Mar 24, 2018 at 07:33:45AM +0100, Nguyễn Thái Ngọc Duy wrote: > This field is only need for pack-bitmap, which is an optional > feature. Move it to a separate array that is only allocated when > pack-bitmap is used (it's not freed in the same way that objects[] is > not). I had trouble parsing the parenthetical in the last sentence. It does make sense if you read it hard enough, but maybe: (like objects[], it is not freed, since we need it until the end of the process) would be more clear? The patch itself seems OK. -Peff
Re: [PATCH v7 04/13] pack-objects: use bitfield for object_entry::depth
On Sat, Mar 24, 2018 at 07:33:44AM +0100, Nguyễn Thái Ngọc Duy wrote: > Because of struct packing from now on we can only handle max depth > 4095 (or even lower when new booleans are added in this struct). This > should be ok since long delta chain will cause significant slow down > anyway. OK. This is the first user-facing change, but I think it really shouldn't hurt anybody. My experiments a while ago showed that chains longer than 50 aren't really worth it, but so this could probably shrink to something like 8 bits if we really needed it to. > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > index 83f8154865..205e1f646c 100644 > --- a/builtin/pack-objects.c > +++ b/builtin/pack-objects.c > @@ -3068,6 +3068,10 @@ int cmd_pack_objects(int argc, const char **argv, > const char *prefix) > if (pack_to_stdout != !base_name || argc) > usage_with_options(pack_usage, pack_objects_options); > > + if (depth >= (1 << OE_DEPTH_BITS)) > + die(_("delta chain depth %d is greater than maximum limit %d"), > + depth, (1 << OE_DEPTH_BITS) - 1); Since this is introducing a new limit, I wonder if we should issue a warning and just clamp it to the maximum value. That would be kinder to people who may have existing (admittedly dumb) setups. -Peff
Re: [PATCH v3 2/3] config.c: introduce 'git_config_color' to parse ANSI colors
On Wed, Mar 28, 2018 at 9:16 PM, Taylor Blauwrote: > In preparation for adding `--color` to the `git-config(1)` builtin, > let's introduce a color parsing utility, `git_config_color` in a similar > fashion to `git_config_`. Did you mean s/--color/--type=color/ ? > Signed-off-by: Taylor Blau
Re: [PATCH v3 1/3] builtin/config: introduce `--default`
On Wed, Mar 28, 2018 at 9:16 PM, Taylor Blauwrote: > For some use cases, callers of the `git-config(1)` builtin would like to > fallback to default values when the slot asked for does not exist. In > addition, users would like to use existing type specifiers to ensure > that values are parsed correctly when they do exist in the > configuration. > > For example, to fetch a value without a type specifier and fallback to > `$fallback`, the following is required: > > $ git config core.foo || echo "$fallback" > > This is fine for most values, but can be tricky for difficult-to-express > `$fallback`'s, like ANSI color codes. > > This motivates `--get-color`, which is a one-off exception to the normal > type specifier rules wherein a user specifies both the configuration > slot and an optional fallback. Both are formatted according to their > type specifier, which eases the burden on the user to ensure that values > are correctly formatted. > > This commit (and those following it in this series) aim to eventually > replace `--get-color` with a consistent alternative. By introducing > `--default`, we allow the `--get-color` action to be promoted to a > `--color` type specifier, retaining the "fallback" behavior via the > `--default` flag introduced in this commit. I'm confused. The cover letter said that this iteration no longer introduces a --color option (favoring instead --type=color), but this commit message still talks about --color. Did you mean s/--color/--type=color/ ? > For example, we aim to replace: > > $ git config --get-color slot [default] [...] > > with: > > $ git config --default default --color slot [...] Ditto: s/--color/--type=color/ > Values filled by `--default` behave exactly as if they were present in > the affected configuration file; they will be parsed by type specifiers > without the knowledge that they are not themselves present in the > configuration. > > Specifically, this means that the following will work: > > $ git config --int --default 1M does.not.exist > 1048576 > > In subsequent commits, we will offer `--color`, which (in conjunction > with `--default`) will be sufficient to replace `--get-color`. Ditto: s/--color/--type=color/ > Signed-off-by: Taylor Blau
Re: [PATCH v7 03/13] pack-objects: use bitfield for object_entry::dfs_state
On Sat, Mar 24, 2018 at 07:33:43AM +0100, Nguyễn Thái Ngọc Duy wrote: > Signed-off-by: Nguyễn Thái Ngọc Duy> --- This probably needs some explanation for people digging in history (even if it's "this is to shrink the size as part of a larger struct-shrinking effort" so they know to dig around in the nearby history). > diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c > index 647c01ea34..83f8154865 100644 > --- a/builtin/pack-objects.c > +++ b/builtin/pack-objects.c > @@ -3049,6 +3049,9 @@ int cmd_pack_objects(int argc, const char **argv, const > char *prefix) > OPT_END(), > }; > > + if (DFS_NUM_STATES > (1 << OE_DFS_STATE_BITS)) > + die("BUG: too many dfs states, increase OE_DFS_STATE_BITS"); I thought this was off-by-one at first, but NUM_STATES is one more than the highest state, so it's right. I suspect all of the dfs and depth stuff could be pulled into a separate array that is used only during that depth search. But as you have it squished down here, I think we may be getting it "for free" in between other non-word-aligned values in the struct. -Peff
Re: [PATCH v7 02/13] pack-objects: turn type and in_pack_type to bitfields
On Sat, Mar 24, 2018 at 07:33:42AM +0100, Nguyễn Thái Ngọc Duy wrote: > +static inline void oe_set_type(struct object_entry *e, > +enum object_type type) > +{ > + if (type >= OBJ_ANY) > + die("BUG: OBJ_ANY cannot be set in pack-objects code"); A minor nit, but this (and other new assertions) should probably be BUG(). > + e->type_valid = type >= OBJ_NONE; > + e->type_ = (unsigned)type; Hmm, so if !e->type_valid, then we may write utter garbage into e->type_. That's OK, since everybody will access it via oe_type(), but I wonder if we could trigger weird compiler behavior. I guess the unsigned cast makes the truncation well-defined. -Peff
Re: [PATCH v2 2/5] trace.c: export trace_setup_key
On Fri, Mar 30, 2018 at 12:50:50PM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > On Fri, Mar 30, 2018 at 12:46:26PM -0700, Junio C Hamano wrote: > > > >> Jeff King writes: > >> > >> > From: Nguyễn Thái Ngọc Duy > >> > > >> > This is so that we can print traces based on this key outside trace.c. > >> > >> "this key" meaning...? GIT_TRACE_SETUP? > > > > I think "based on trace_setup_key". > > > > -Peff > > Yeah, I read, but did not pay enough attention to, the subject X-<. To be fair, one of our guidelines is that the commit message should not overly rely on the subject line. Though I am certainly guilty of starting many a message with "This". Perhaps: The setup-tracing code is static-local to trace.c. In preparation for new GIT_TRACE_SETUP code outside of trace.c, let's make the trace_key globally available. would be a better commit message. -Peff
Re: [PATCH] config: move flockfile() closer to unlocked functions
Jeff Kingwrites: > On Fri, Mar 30, 2018 at 09:04:13PM +0200, Johannes Schindelin wrote: > >> > Probably the flockfile should go into do_config_from_file(), where we >> > specify to use the unlocked variants. >> >> Ah, that makes sense now! I am glad I could also help ;-) > > :) > >> > Yeah, I'll wait to see how your refactor turns out. >> >> I don't think I'll touch too much in that part of the code. My changes >> should not cause merge conflicts with a patch moving the >> flockfile()/funlockfile() calls to do_config_from_file(). > > OK, then let's do this while we're thinking about it: Yup, what Dscho found was quite amusing ;-) and this obviously makes the code clearer to follow. Will queue, thanks.
Re: [PATCH v2 2/5] trace.c: export trace_setup_key
Jeff Kingwrites: > On Fri, Mar 30, 2018 at 12:46:26PM -0700, Junio C Hamano wrote: > >> Jeff King writes: >> >> > From: Nguyễn Thái Ngọc Duy >> > >> > This is so that we can print traces based on this key outside trace.c. >> >> "this key" meaning...? GIT_TRACE_SETUP? > > I think "based on trace_setup_key". > > -Peff Yeah, I read, but did not pay enough attention to, the subject X-<.
Re: [PATCH v2 2/5] trace.c: export trace_setup_key
On Fri, Mar 30, 2018 at 12:46:26PM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > From: Nguyễn Thái Ngọc Duy > > > > This is so that we can print traces based on this key outside trace.c. > > "this key" meaning...? GIT_TRACE_SETUP? I think "based on trace_setup_key". -Peff
Re: [PATCH v2 2/5] trace.c: export trace_setup_key
Jeff Kingwrites: > From: Nguyễn Thái Ngọc Duy > > This is so that we can print traces based on this key outside trace.c. "this key" meaning...? GIT_TRACE_SETUP? > > Signed-off-by: Nguyễn Thái Ngọc Duy > Signed-off-by: Jeff King > --- > trace.c | 14 +++--- > trace.h | 1 + > 2 files changed, 8 insertions(+), 7 deletions(-) > > diff --git a/trace.c b/trace.c > index 7f3b08e148..fc623e91fd 100644 > --- a/trace.c > +++ b/trace.c > @@ -26,6 +26,7 @@ > > struct trace_key trace_default_key = { "GIT_TRACE", 0, 0, 0 }; > struct trace_key trace_perf_key = TRACE_KEY_INIT(PERFORMANCE); > +struct trace_key trace_setup_key = TRACE_KEY_INIT(SETUP); > > /* Get a trace file descriptor from "key" env variable. */ > static int get_trace_fd(struct trace_key *key) > @@ -300,11 +301,10 @@ static const char *quote_crnl(const char *path) > /* FIXME: move prefix to startup_info struct and get rid of this arg */ > void trace_repo_setup(const char *prefix) > { > - static struct trace_key key = TRACE_KEY_INIT(SETUP); > const char *git_work_tree; > char *cwd; > > - if (!trace_want()) > + if (!trace_want(_setup_key)) > return; > > cwd = xgetcwd(); > @@ -315,11 +315,11 @@ void trace_repo_setup(const char *prefix) > if (!prefix) > prefix = "(null)"; > > - trace_printf_key(, "setup: git_dir: %s\n", > quote_crnl(get_git_dir())); > - trace_printf_key(, "setup: git_common_dir: %s\n", > quote_crnl(get_git_common_dir())); > - trace_printf_key(, "setup: worktree: %s\n", > quote_crnl(git_work_tree)); > - trace_printf_key(, "setup: cwd: %s\n", quote_crnl(cwd)); > - trace_printf_key(, "setup: prefix: %s\n", quote_crnl(prefix)); > + trace_printf_key(_setup_key, "setup: git_dir: %s\n", > quote_crnl(get_git_dir())); > + trace_printf_key(_setup_key, "setup: git_common_dir: %s\n", > quote_crnl(get_git_common_dir())); > + trace_printf_key(_setup_key, "setup: worktree: %s\n", > quote_crnl(git_work_tree)); > + trace_printf_key(_setup_key, "setup: cwd: %s\n", quote_crnl(cwd)); > + trace_printf_key(_setup_key, "setup: prefix: %s\n", > quote_crnl(prefix)); > > free(cwd); > } > diff --git a/trace.h b/trace.h > index 88055abef7..2b6a1bc17c 100644 > --- a/trace.h > +++ b/trace.h > @@ -15,6 +15,7 @@ extern struct trace_key trace_default_key; > > #define TRACE_KEY_INIT(name) { "GIT_TRACE_" #name, 0, 0, 0 } > extern struct trace_key trace_perf_key; > +extern struct trace_key trace_setup_key; > > extern void trace_repo_setup(const char *prefix); > extern int trace_want(struct trace_key *key);
Re: [PATCH v2 0/5] re-parenting relative directories after chdir
On Fri, Mar 30, 2018 at 02:34:25PM -0400, Jeff King wrote: > On Wed, Mar 28, 2018 at 01:36:56PM -0400, Jeff King wrote: > > > For those just joining us, this fixes a regression that was in v2.13 (so > > nothing we need to worry about as part of the 2.17-rc track). > > > > [1/4]: set_git_dir: die when setenv() fails > > [2/4]: add chdir-notify API > > [3/4]: set_work_tree: use chdir_notify > > [4/4]: refs: use chdir_notify to update cached relative paths > > Here's a re-roll based on the feedback I got, including: > > - fixes the memory leak and vague comment pointed out by Eric > > - adds in tracing code from Duy > > - I took Duy's suggestions regarding "least" surprise in some of the >functions (reparenting NULL is a noop, and registering a reparent >handler does so even for an absolute path). > > I punted on the "registering the same path twice" thing. That is a > potential way to misuse this API, but I don't think there's a good > solution. The "reparent" helper could figure this out for you, but in > the general case we actually install an arbitrary callback. So the > caller really has to handle it there. The series looks good to me. > > I think in the long run we'd want to outlaw calling set_git_dir() twice > anyway. Oh yeah. With my latest WIP changes, the bottom of setup_git_directory_gently() looks like this. Nowhere else in setup code calls these functions anymore (except the current setup_work_tree) -- 8< -- if (result.worktree) set_git_work_tree(result.worktree); if (result.gitdir) set_git_dir(result.gitdir); if (startup_info->have_repository) repo_set_hash_algo(the_repository, result.repo_fmt.hash_algo); ... return result.prefix; -- 8< -- >From here on, it's not hard to see how to turn set_git_work_tree() into setup_work_tree_gently() (without doing any set_git_dir) and the last two calls into "repo_init_gitdir(gitdir, hash_algo)", which should be where we allocate a new repository object and initialize related object store, ref store... But I still have a couple setup corner cases to deal with first :( -- Duy
[PATCH] config: move flockfile() closer to unlocked functions
On Fri, Mar 30, 2018 at 09:04:13PM +0200, Johannes Schindelin wrote: > > Probably the flockfile should go into do_config_from_file(), where we > > specify to use the unlocked variants. > > Ah, that makes sense now! I am glad I could also help ;-) :) > > Yeah, I'll wait to see how your refactor turns out. > > I don't think I'll touch too much in that part of the code. My changes > should not cause merge conflicts with a patch moving the > flockfile()/funlockfile() calls to do_config_from_file(). OK, then let's do this while we're thinking about it: -- >8 -- Subject: config: move flockfile() closer to unlocked functions Commit 260d408e32 (config: use getc_unlocked when reading from file, 2015-04-16) taught git_config_from_file() to lock the filehandle so that we could safely use the faster unlocked functions to access the handle. However, it split the logic into two places: 1. The master lock/unlock happens in git_config_from_file(). 2. The decision to use the unlocked functions happens in do_config_from_file(). That means that if anybody calls the latter function, they will accidentally use the unlocked functions without holding the lock. And indeed, git_config_from_stdin() does so. In practice, this hasn't been a problem since this code isn't generally multi-threaded (and even if some Git program happened to have another thread running, it's unlikely to be reading from stdin). But it's a good practice to make sure we're always holding the lock before using the unlocked functions. Helped-by: Johannes SchindelinSigned-off-by: Jeff King --- I wasn't sure if this was "helped by" or "reported by" or "stumbled-upon-by". :) config.c | 8 +--- 1 file changed, 5 insertions(+), 3 deletions(-) diff --git a/config.c b/config.c index b0c20e6cb8..609ef2f58b 100644 --- a/config.c +++ b/config.c @@ -1426,6 +1426,7 @@ static int do_config_from_file(config_fn_t fn, void *data) { struct config_source top; + int ret; top.u.file = f; top.origin_type = origin_type; @@ -1436,7 +1437,10 @@ static int do_config_from_file(config_fn_t fn, top.do_ungetc = config_file_ungetc; top.do_ftell = config_file_ftell; - return do_config_from(, fn, data); + flockfile(f); + ret = do_config_from(, fn, data); + funlockfile(f); + return ret; } static int git_config_from_stdin(config_fn_t fn, void *data) @@ -1451,9 +1455,7 @@ int git_config_from_file(config_fn_t fn, const char *filename, void *data) f = fopen_or_warn(filename, "r"); if (f) { - flockfile(f); ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, filename, f, data); - funlockfile(f); fclose(f); } return ret; -- 2.17.0.rc2.594.gdb94a0ce02
Re: [PATCH 1/9] git_config_set: fix off-by-two
On Fri, Mar 30, 2018 at 8:53 PM, Johannes Schindelinwrote: > I think the best course of action would be to incrementally do away with > the shell scripted test framework, in the way you outlined earlier this > year. This would *also* buy us a wealth of other benefits, such as better > control over the parallelization, resource usage, etc. If you have not noticed, I'm a bit busy with all sorts of stuff and probably won't continue that work. And since it affects you the most, you probably have the best motive to tackle it ;-) I don't think complaining about slow test suite helps. And avoiding adding more tests because of that definitely does not help. > It would also finally make it easier to introduce something like "smart > testing" where code coverage could be computed (this works only for C > code, of course, not for the many scripted parts of core Git), and a diff > could be inspected to discover which tests *really* need to be run, > skipping the tests that would only touch unchanged code. -- Duy
Re: A potential approach to making tests faster on Windows
On Fri, Mar 30, 2018 at 08:45:45PM +0200, Ævar Arnfjörð Bjarmason wrote: > I've wondered for a while whether it wouldn't be a viable approach to > make something like an interpreter for our test suite to get around this > problem, i.e. much of it's very repetitive and just using a few shell > functions we've defined, what if we had C equivalents of those? I've had a similar thought, though I wonder how far we could get with just shell. I even tried it out with test_cmp: https://public-inbox.org/git/20161020215647.5no7effvutwep...@sigill.intra.peff.net/ But Johannes Sixt pointed out that they already do this (see mingw_test_cmp in test-lib-functions). I also tried to explore a few numbers about process invocations to see if running shell commands is the problem: https://public-inbox.org/git/20161020123111.qnbsainul2g54...@sigill.intra.peff.net/ There was some discussion there about whether the problem is programs being exec'd, or if it's forks due to subshells. And if it is programs being exec'd, whether it's shell programs or if it is simply that we exec Git a huge number of times. -Peff
Re: [PATCH] setup.c: reset candidate->work_tree after freeing it
On Fri, Mar 30, 2018 at 8:32 PM, Junio C Hamanowrote: > Duy Nguyen writes: > >> On Fri, Mar 30, 2018 at 7:10 PM, Junio C Hamano wrote: >> >>> Which fields in candidate are safe to peek by the caller? How can a >>> caller tell? >> >> To me, all fields should be valid after >> check_repository_format_gently(). > > If so, free() is wrong in the first place, and FREE_AND_NULL() is > making it even worse, no? We learned there is work_tree set to > somewhere, the original code by Peff before abade65b ("setup: expose > enumerated repo info", 2017-11-12) freed it because the code no > longer needed that piece of information. If we are passing all we > learned back to the caller, we should not free the field in the > function at all. But it seems (below) the codepath is messier than > that. Actually no, NULL is the right value. I was trying to say that this mysterious code was about _deliberately_ ignore core.worktree. By keeping repo_fmt->worktree as NULL we tell the caller "core.worktree is not set". The current code also does that but in a different way: it sets git_work_tree_cfg based on candidate->worktree, but only for the "!has_common" block. >> We still need to free and set NULL here though in addition to a >> cleanup interface. The reason is, when checking repo config from a >> worktree, we deliberately ignore core.worktree (which belongs to the >> main repo only). The implicit line near this >> free(candidate->work_tree) is "leave git_work_tree_cfg alone, we don't >> recognize core.worktree". Once we move setting git_work_tree_cfg out >> of this function, this becomes clear. > > So in other words, there is a code that looks at the field and it > _wants_ to see NULL there---otherwise that brittle code misbehaves > and FREE_AND_NULL() is a bad-aid to work it around? > > Then proposed log message "leaving it dangling is unsanitary" is > *not* what is going on here, and the real reason why the code should > be like so deserve to be described both in the log message and in a > large in-code comment, no? Let's drop this for now. I'm a bit further along in refactoring this code that I thought I could. It'll be clearer when the caller is also updated to show what's wrong. -- Duy
Re: [PATCH] {fetch,upload}-pack: clearly mark unreachable v2 code
On Fri, Mar 30, 2018 at 12:03 PM, Brandon Williamswrote: > On 03/30, Ævar Arnfjörð Bjarmason wrote: >> Change the switch statement driving upload_pack_v2() and >> do_fetch_pack_v2() to clearly indicate that the FETCH_DONE case is >> being handled implicitly by other code, instead of giving the reader >> the impression that the "continue" statement is needed. >> >> This issue was flagged as DEADCODE by Coverity[1]. Simply removing the >> "case FETCH_DONE" would make -Wswitch warn. Instead implement the same >> solution discussed for my "[PATCH v2 18/29] grep: catch a missing enum >> in switch statement" patch[2] (which never made it into git.git). >> >> 1. >> https://public-inbox.org/git/CAGZ79kbAOcwaRzjuMtZ_HVsYvUr_7UAPbOcnrmPgsdE19q=p...@mail.gmail.com/ >> 2. https://public-inbox.org/git/20170513231509.7834-19-ava...@gmail.com/ > > I understand why you want this change, but I dislike it because it > removes the ability to have the compiler tell you that your switch > statements are exhaustive. Of course it should be noticed rather > quickly by the addition of those BUG statements :) I think coverity doesn't flag empty sections, i.e. case FETCH_DONE: default: BUG(...) would do for coverity. Not sure if we want to add a /*fall thru */ comment, that would aid other compilers to not warn about it. This would cover both the compiler as well as coverity. Stefan
Re: [PATCH 8/9] git_config_set: use do_config_from_file() directly
Hi Peff, On Fri, 30 Mar 2018, Jeff King wrote: > On Fri, Mar 30, 2018 at 04:01:56PM +0200, Johannes Schindelin wrote: > > > You know what is *really* funny? > > > > -- snip -- > > static int git_config_from_stdin(config_fn_t fn, void *data) > > { > > return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, > > stdin, data, 0); > > } > > > > int git_config_from_file(config_fn_t fn, const char *filename, void *data) > > { > > int ret = -1; > > FILE *f; > > > > f = fopen_or_warn(filename, "r"); > > if (f) { > > flockfile(f); > > ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, > > filename, f, data, 0); > > funlockfile(f); > > fclose(f); > > } > > return ret; > > } > > -- snap -- > > > > So the _stdin variant *goes out of its way not to flockfile()*... > > *facepalm* That's probably my fault, since git_config_from_stdin() > existed already when I did the flockfile stuff. > > Probably the flockfile should go into do_config_from_file(), where we > specify to use the unlocked variants. Ah, that makes sense now! I am glad I could also help ;-) > > But I guess all this will become moot when I start handing down the config > > options. It does mean that I have to change the signatures in header > > files, oh well ;-) > > > > But then I can drop this here patch and we can stop musing about > > flockfile() ;-) > > Yeah, I'll wait to see how your refactor turns out. I don't think I'll touch too much in that part of the code. My changes should not cause merge conflicts with a patch moving the flockfile()/funlockfile() calls to do_config_from_file(). Ciao, Dscho
Re: [PATCH] {fetch,upload}-pack: clearly mark unreachable v2 code
On 03/30, Ævar Arnfjörð Bjarmason wrote: > Change the switch statement driving upload_pack_v2() and > do_fetch_pack_v2() to clearly indicate that the FETCH_DONE case is > being handled implicitly by other code, instead of giving the reader > the impression that the "continue" statement is needed. > > This issue was flagged as DEADCODE by Coverity[1]. Simply removing the > "case FETCH_DONE" would make -Wswitch warn. Instead implement the same > solution discussed for my "[PATCH v2 18/29] grep: catch a missing enum > in switch statement" patch[2] (which never made it into git.git). > > 1. > https://public-inbox.org/git/CAGZ79kbAOcwaRzjuMtZ_HVsYvUr_7UAPbOcnrmPgsdE19q=p...@mail.gmail.com/ > 2. https://public-inbox.org/git/20170513231509.7834-19-ava...@gmail.com/ I understand why you want this change, but I dislike it because it removes the ability to have the compiler tell you that your switch statements are exhaustive. Of course it should be noticed rather quickly by the addition of those BUG statements :) > > Signed-off-by: Ævar Arnfjörð Bjarmason> --- > fetch-pack.c | 4 ++-- > upload-pack.c | 4 ++-- > 2 files changed, 4 insertions(+), 4 deletions(-) > > diff --git a/fetch-pack.c b/fetch-pack.c > index 216d1368be..3a16b4bc1a 100644 > --- a/fetch-pack.c > +++ b/fetch-pack.c > @@ -1393,8 +1393,8 @@ static struct ref *do_fetch_pack_v2(struct > fetch_pack_args *args, > > state = FETCH_DONE; > break; > - case FETCH_DONE: > - continue; > + default: > + BUG("Added a new fetch_state without updating switch"); > } > } > > diff --git a/upload-pack.c b/upload-pack.c > index 87b4d32a6e..b7a7601c83 100644 > --- a/upload-pack.c > +++ b/upload-pack.c > @@ -1416,8 +1416,8 @@ int upload_pack_v2(struct repository *r, struct > argv_array *keys, > create_pack_file(); > state = FETCH_DONE; > break; > - case FETCH_DONE: > - continue; > + default: > + BUG("Added a new fetch_state without updating switch"); > } > } > > -- > 2.16.2.804.g6dcf76e118 > -- Brandon Williams
Re: [PATCH 1/9] git_config_set: fix off-by-two
Johannes Schindelinwrites: > What would be a *really* good strategy is: "Oh, there is a problem! Let's > acknowledge it and try to come up with a solution rather than a > work-around". > > EXPENSIVE_ON_WINDOWS is a symptom. Not a solution. Yes, it is a workaround. Making shell faster on windows would of course be one possible solution to make t/t*.sh scripts go faster ;-) Or update parts of t/t*.sh so that the equivalent test coverage can be kept while running making them go faster on Windows.
Re: Bad refspec messes up bundle.
Hi Junio, On Fri, 30 Mar 2018, Junio C Hamano wrote: > Johannes Schindelinwrites: > > >> > Is this a bug? Should bundle allow providing multiple refspecs when > > ... > > I agree that it is a bug if a bundle file records a ref multiple > > times. Luciano, here are some pointers so you can fix it: > > > > - probably the best way to start would be to add a new test case to > > t/t5607-clone-bundle.sh. The script *should* be relatively easy to > > understand and imitate. The new test case would probably look somewhat > > like this: > > > > test_expect_failure 'bundles must not contain multiple refs' ' > > s/multiple/duplicate/. It is not unusual for a bundle to record > more than one ref; it is (1) useless and harmful to unsuspecting > clients to record the same ref twice with the same value and (2) > nonesnse to record the same ref twice with different value. Of course! Thanks for pointing this out. > Other than that, the outline seems to go in the right general > direction. Excellent. Luciano, the ball is in your court now. If you get stuck with anything (such as getting started with building Git), please let us know. We can help. Ciao, Johannes
Re: A potential approach to making tests faster on Windows
Ævar Arnfjörð Bjarmasonwrites: > On Fri, Mar 30 2018, Johannes Schindelin wrote [expressing frustrations > about Windows test suite slowness]: > > I've wondered for a while whether it wouldn't be a viable approach to > make something like an interpreter for our test suite to get around this > problem, i.e. much of it's very repetitive and just using a few shell > functions we've defined, what if we had C equivalents of those? > ... > > I don't have time or interest to work on this now, but thought it was > interesting to share. This assumes that something in shellscript like: > > while echo foo; do echo bar; done > > Is no slower on Windows than *nix, since it's purely using built-ins, as > opposed to something that would shell out. That's interesting; it certainly is appreciated to be constructive to find a usable solution.
Re: Draft of Git Rev News edition 37
Hi Jake, On Fri, 30 Mar 2018, Jacob Keller wrote: > On Fri, Mar 30, 2018 at 2:02 AM, Johannes Schindelin >wrote: > > > > Jake, I do not know about your availability, but I would love it if > > you could take a stab, as I trust you to be unbiased. I would not > > trust myself to be unbiased because (as everybody saw who cared to > > read) I got a little bit too emotional and should have stayed more > > professional. > > I hope to be able to make a summary of the discussion as best I can. > It may take a bit as there is a lot of mails to read. I agree that a > good summary should come from someone outside the discussion to reduce > emotional bias. Thank you so much! Dscho
Re: [PATCH 1/9] git_config_set: fix off-by-two
Hi Duy, On Fri, 30 Mar 2018, Duy Nguyen wrote: > On Fri, Mar 30, 2018 at 2:32 PM, Johannes Schindelin >wrote: > > > > On Thu, 29 Mar 2018, Jeff King wrote: > > > >> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote: > >> > >> > > When calling `git config --unset abc.a` on this file, it leaves this > >> > > (invalid) config behind: > >> > > > >> > > [ > >> > > [xyz] > >> > > key = value > >> > > > >> > > The reason is that we try to search for the beginning of the line (or > >> > > for the end of the preceding section header on the same line) that > >> > > defines abc.a, but as an optimization, we subtract 2 from the offset > >> > > pointing just after the definition before we call > >> > > find_beginning_of_line(). That function, however, *also* performs that > >> > > optimization and promptly fails to find the section header correctly. > >> > > >> > This commit message would be more convincing if we had it in test form. > >> > >> I agree a test might be nice. But I don't find the commit message > >> unconvincing at all. It explains pretty clearly why the bug occurs, and > >> you can verify it by looking at find_beginning_of_line. > >> > >> > [abc]a > >> > > >> > is not written by Git, but would be written from an outside tool or > >> > person > >> > and we barely cope with it? > >> > >> Yes, I don't think git would ever write onto the same line. But clearly > >> we should handle anything that's syntactically valid. > > > > I was tempted to add the test case, because it is easy to test it. > > > > But I then decided *not* to add it. Why? Testing is a balance between "can > > do" and "need to do". > > > > Can you imagine that I did *not* run the entire test suite before > > submitting this patch series, because it takes an incredible *90 minutes* > > to run *on a fast Windows machine*? > > What's wrong with firing up a new worktree, run the test suite there > and go back to do something else so you won't waste time just waiting > for test results and submit? Sure there is a mental overhead for > switching tasks, but at 90 minutes, I think it's worth doing. Of course it is worth doing. That's why I often test the end result on Windows (waiting those 90 minutes, but I do not fire up a new worktree, I use my cloud privilege and let Azure/Visual Studio Team Services do the work for me, without slowing down my laptop). What I would love to do, however, would be to test all intermediate patches, too, as that often shows a problem with my frequent reorderings via interactive rebases. And 90 minutes times 9 is... 13 hours and 30 minutes. That's a really long time. I think the best course of action would be to incrementally do away with the shell scripted test framework, in the way you outlined earlier this year. This would *also* buy us a wealth of other benefits, such as better control over the parallelization, resource usage, etc. It would also finally make it easier to introduce something like "smart testing" where code coverage could be computed (this works only for C code, of course, not for the many scripted parts of core Git), and a diff could be inspected to discover which tests *really* need to be run, skipping the tests that would only touch unchanged code. Ciao, Dscho
Re: [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug)
Hi Ævar, On Fri, 30 Mar 2018, Ævar Arnfjörð Bjarmason wrote: > > On Thu, Mar 29 2018, Johannes Schindelin wrote: > > > Nonetheless, I would be confortable with this patch going into > > v2.17.0, even at this late stage. The final verdict is Junio's, of > > course. > > Thanks a lot for working on this. I'm keen to stress test this, but > won't have time in the next few days, and in any case think that the > parts that change functionality should wait until after 2.17 (but e.g. > the test renaming would be fine for a cherry-pick). Obviously this was never meant to get into v2.17.0 (apart maybe from 1/9, which however is so contested over that addition of the test case under the assumption that anybody but me would dare to touch those parts of the code). Ciao, Dscho
A potential approach to making tests faster on Windows
On Fri, Mar 30 2018, Johannes Schindelin wrote [expressing frustrations about Windows test suite slowness]: I've wondered for a while whether it wouldn't be a viable approach to make something like an interpreter for our test suite to get around this problem, i.e. much of it's very repetitive and just using a few shell functions we've defined, what if we had C equivalents of those? Duy had a WIP patch set a while ago to add C test suite support, but I thought what if we turn that inside-out, and instead have a shell interpreter that knows about the likes of test_cmp, and executes them directly? Here's proof of concept as a patch to the dash shell: u dash (debian/master=) $ git diff diff --git a/src/builtins.def.in b/src/builtins.def.in index 4441fe4..b214a17 100644 --- a/src/builtins.def.in +++ b/src/builtins.def.in @@ -92,3 +92,4 @@ ulimitcmd ulimit #endif testcmdtest [ killcmd-u kill +testcmpcmd test_cmp diff --git a/src/jobs.c b/src/jobs.c index c2c2332..905563f 100644 --- a/src/jobs.c +++ b/src/jobs.c @@ -1502,3 +1502,12 @@ getstatus(struct job *job) { jobno(job), job->nprocs, status, retval)); return retval; } + +#include +int +testcmpcmd(argc, argv) + int argc; + char **argv; +{ + fprintf(stderr, "Got %d arguments\n", argc); +} I just added that to jobs.c because it was easiest, then test_cmp becomes a builtin: u dash (debian/master=) $ src/dash -c 'type test_cmp' test_cmp is a shell builtin u dash (debian/master=) $ src/dash -c 'echo foo && test_cmp 1 2 3' foo Got 4 arguments I.e. it's really easy to add new built in commands to the dash shell (and probably other shells, but dash is really small & fast). We could carry some patch like that to dash, and also patch it so test-lib.sh could know that that was our own custom shell, and we'd then skip defining functions like test_cmp, and instead use that new builtin. Similarly, it could then be linked to our own binaries, and the test-tool would be a builtin that would appropriately dispatch, and we could even eventually make "git" a shell builtin. I don't have time or interest to work on this now, but thought it was interesting to share. This assumes that something in shellscript like: while echo foo; do echo bar; done Is no slower on Windows than *nix, since it's purely using built-ins, as opposed to something that would shell out.
Re: [PATCH 1/9] git_config_set: fix off-by-two
Hi Junio, On Fri, 30 Mar 2018, Junio C Hamano wrote: > Ævar Arnfjörð Bjarmasonwrites: > > > I think if it's worth fixing it's worth testing for, a future change to > > the config code could easily introduce a regression for this, and > > particularly in this type of code obscure edge cases like this can point > > to bugs elsewhere. > > Yup. "The port to my favourite platform is too slow, and everybody > should learn to live with thin test coverage" would not be a good > strategy in the longer run. What would be a *really* good strategy is: "Oh, there is a problem! Let's acknowledge it and try to come up with a solution rather than a work-around". EXPENSIVE_ON_WINDOWS is a symptom. Not a solution. And you are actively hurting my ability to contribute, I hope you are aware of that. Ciao, Dscho
[PATCH v2 4/5] set_work_tree: use chdir_notify
When we change to the top of the working tree, we manually re-adjust $GIT_DIR and call set_git_dir() again, in order to update any relative git-dir we'd compute earlier. Instead of the work-tree code having to know to call the git-dir code, let's use the new chdir_notify interface. There are two spots that need updating, with a few subtleties in each: 1. the set_git_dir() code needs to chdir_notify_register() so it can be told when to update its path. Technically we could push this down into repo_set_gitdir(), so that even repository structs besides the_repository could benefit from this. But that opens up a lot of complications: - we'd still need to touch set_git_dir(), because it does some other setup (like setting $GIT_DIR in the environment) - submodules using other repository structs get cleaned up, which means we'd need to remove them from the chdir_notify list - it's unlikely to fix any bugs, since we shouldn't generally chdir() in the middle of working on a submodule 2. setup_work_tree now needs to call chdir_notify(), and can lose its manual set_git_dir() call. Note that at first glance it looks like this undoes the absolute-to-relative optimization added by 044bbbcb63 (Make git_dir a path relative to work_tree in setup_work_tree(), 2008-06-19). But for the most part that optimization was just _undoing_ the relative-to-absolute conversion which the function was doing earlier (and which is now gone). It is true that if you already have an absolute git_dir that the setup_work_tree() function will no longer make it relative as a side effect. But: - we generally do have relative git-dir's due to the way the discovery code works - if we really care about making git-dir's relative when possible, then we should be relativizing them earlier (e.g., when we see an absolute $GIT_DIR we could turn it relative, whether we are going to chdir into a worktree or not). That would cover all cases, including ones that 044bbbcb63 did not. Signed-off-by: Jeff King--- environment.c | 23 ++- setup.c | 9 +++-- 2 files changed, 25 insertions(+), 7 deletions(-) diff --git a/environment.c b/environment.c index e01acf8b11..903a6c9df7 100644 --- a/environment.c +++ b/environment.c @@ -13,6 +13,7 @@ #include "refs.h" #include "fmt-merge-msg.h" #include "commit.h" +#include "chdir-notify.h" int trust_executable_bit = 1; int trust_ctime = 1; @@ -296,7 +297,7 @@ char *get_graft_file(void) return the_repository->graft_file; } -void set_git_dir(const char *path) +static void set_git_dir_1(const char *path) { if (setenv(GIT_DIR_ENVIRONMENT, path, 1)) die("could not set GIT_DIR to '%s'", path); @@ -304,6 +305,26 @@ void set_git_dir(const char *path) setup_git_env(); } +static void update_relative_gitdir(const char *name, + const char *old_cwd, + const char *new_cwd, + void *data) +{ + char *path = reparent_relative_path(old_cwd, new_cwd, get_git_dir()); + trace_printf_key(_setup_key, +"setup: move $GIT_DIR to '%s'", +path); + set_git_dir_1(path); + free(path); +} + +void set_git_dir(const char *path) +{ + set_git_dir_1(path); + if (!is_absolute_path(path)) + chdir_notify_register(NULL, update_relative_gitdir, NULL); +} + const char *get_log_output_encoding(void) { return git_log_output_encoding ? git_log_output_encoding diff --git a/setup.c b/setup.c index 7287779642..9eb2e808e1 100644 --- a/setup.c +++ b/setup.c @@ -3,6 +3,7 @@ #include "config.h" #include "dir.h" #include "string-list.h" +#include "chdir-notify.h" static int inside_git_dir = -1; static int inside_work_tree = -1; @@ -378,7 +379,7 @@ int is_inside_work_tree(void) void setup_work_tree(void) { - const char *work_tree, *git_dir; + const char *work_tree; static int initialized = 0; if (initialized) @@ -388,10 +389,7 @@ void setup_work_tree(void) die(_("unable to set up work tree using invalid config")); work_tree = get_git_work_tree(); - git_dir = get_git_dir(); - if (!is_absolute_path(git_dir)) - git_dir = real_path(get_git_dir()); - if (!work_tree || chdir(work_tree)) + if (!work_tree || chdir_notify(work_tree)) die(_("this operation must be run in a work tree")); /* @@ -401,7 +399,6 @@ void setup_work_tree(void) if (getenv(GIT_WORK_TREE_ENVIRONMENT)) setenv(GIT_WORK_TREE_ENVIRONMENT, ".", 1); - set_git_dir(remove_leading_path(git_dir, work_tree)); initialized = 1; }
[PATCH v2 3/5] add chdir-notify API
If one part of the code does a permanent chdir(), then this invalidates any relative paths that may be held by other parts of the code. For example, setup_work_tree() moves us to the top of the working tree, which may invalidate a previously stored relative gitdir. We've hacked around this case by teaching setup_work_tree() to re-run set_git_dir() with an adjusted path, but this stomps all over the idea of module boundaries. setup_work_tree() shouldn't have to know all of the places that need to be fed an adjusted path. And indeed, there's at least one other place (the refs code) which needs adjusting. Let's provide an API to let code that stores relative paths "subscribe" to updates to the current working directory. This means that callers of chdir() don't need to know about all subscribers ahead of time; they can simply consult a dynamically built list. Note that our helper function to reparent relative paths uses the simple remove_leading_path(). We could in theory use the much smarter relative_path(), but that led to some problems as described in 41894ae3a3 (Use simpler relative_path when set_git_dir, 2013-10-14). Since we're aiming to replace the setup_work_tree() code here, let's follow its lead. Signed-off-by: Jeff King--- Makefile | 1 + chdir-notify.c | 93 ++ chdir-notify.h | 73 +++ 3 files changed, 167 insertions(+) create mode 100644 chdir-notify.c create mode 100644 chdir-notify.h diff --git a/Makefile b/Makefile index a1d8775adb..0da98b9274 100644 --- a/Makefile +++ b/Makefile @@ -772,6 +772,7 @@ LIB_OBJS += branch.o LIB_OBJS += bulk-checkin.o LIB_OBJS += bundle.o LIB_OBJS += cache-tree.o +LIB_OBJS += chdir-notify.o LIB_OBJS += checkout.o LIB_OBJS += color.o LIB_OBJS += column.o diff --git a/chdir-notify.c b/chdir-notify.c new file mode 100644 index 00..5f7f2c2ac2 --- /dev/null +++ b/chdir-notify.c @@ -0,0 +1,93 @@ +#include "cache.h" +#include "chdir-notify.h" +#include "list.h" +#include "strbuf.h" + +struct chdir_notify_entry { + const char *name; + chdir_notify_callback cb; + void *data; + struct list_head list; +}; +static LIST_HEAD(chdir_notify_entries); + +void chdir_notify_register(const char *name, + chdir_notify_callback cb, + void *data) +{ + struct chdir_notify_entry *e = xmalloc(sizeof(*e)); + e->name = name; + e->cb = cb; + e->data = data; + list_add_tail(>list, _notify_entries); +} + +static void reparent_cb(const char *name, + const char *old_cwd, + const char *new_cwd, + void *data) +{ + char **path = data; + char *tmp = *path; + + if (!tmp) + return; + + *path = reparent_relative_path(old_cwd, new_cwd, tmp); + free(tmp); + + if (name) { + trace_printf_key(_setup_key, +"setup: reparent %s to '%s'", +name, *path); + } +} + +void chdir_notify_reparent(const char *name, char **path) +{ + chdir_notify_register(name, reparent_cb, path); +} + +int chdir_notify(const char *new_cwd) +{ + struct strbuf old_cwd = STRBUF_INIT; + struct list_head *pos; + + if (strbuf_getcwd(_cwd) < 0) + return -1; + if (chdir(new_cwd) < 0) { + int saved_errno = errno; + strbuf_release(_cwd); + errno = saved_errno; + return -1; + } + + trace_printf_key(_setup_key, +"setup: chdir from '%s' to '%s'", +old_cwd.buf, new_cwd); + + list_for_each(pos, _notify_entries) { + struct chdir_notify_entry *e = + list_entry(pos, struct chdir_notify_entry, list); + e->cb(e->name, old_cwd.buf, new_cwd, e->data); + } + + strbuf_release(_cwd); + return 0; +} + +char *reparent_relative_path(const char *old_cwd, +const char *new_cwd, +const char *path) +{ + char *ret, *full; + + if (is_absolute_path(path)) + return xstrdup(path); + + full = xstrfmt("%s/%s", old_cwd, path); + ret = xstrdup(remove_leading_path(full, new_cwd)); + free(full); + + return ret; +} diff --git a/chdir-notify.h b/chdir-notify.h new file mode 100644 index 00..366e4c1ee9 --- /dev/null +++ b/chdir-notify.h @@ -0,0 +1,73 @@ +#ifndef CHDIR_NOTIFY_H +#define CHDIR_NOTIFY_H + +/* + * An API to let code "subscribe" to changes to the current working directory. + * The general idea is that some code asks to be notified when the working + * directory changes, and other code that calls chdir uses a special wrapper + * that notifies everyone. + */ + +/* + * Callers who need to know about changes
[PATCH v2 5/5] refs: use chdir_notify to update cached relative paths
Commit f57f37e2e1 (files-backend: remove the use of git_path(), 2017-03-26) introduced a regression when a relative $GIT_DIR is used in a working tree: - when we initialize the ref backend, we make a copy of get_git_dir(), which may be relative - later, we may call setup_work_tree() and chdir to the root of the working tree - further calls to the ref code will use the stored git directory, but relative paths will now point to the wrong place The new test in t1501 demonstrates one such instance (the bug causes us to write the ref update to the nonsense "relative/relative/.git"). Since setup_work_tree() now uses chdir_notify, we can just ask it update our relative paths when necessary. Reported-by: Rafael AscensaoHelped-by: Nguyễn Thái Ngọc Duy Signed-off-by: Jeff King --- refs/files-backend.c | 6 ++ refs/packed-backend.c | 3 +++ t/t1501-work-tree.sh | 12 3 files changed, 21 insertions(+) diff --git a/refs/files-backend.c b/refs/files-backend.c index bec8e30e9e..a92a2aa821 100644 --- a/refs/files-backend.c +++ b/refs/files-backend.c @@ -9,6 +9,7 @@ #include "../lockfile.h" #include "../object.h" #include "../dir.h" +#include "../chdir-notify.h" /* * This backend uses the following flags in `ref_update::flags` for @@ -106,6 +107,11 @@ static struct ref_store *files_ref_store_create(const char *gitdir, refs->packed_ref_store = packed_ref_store_create(sb.buf, flags); strbuf_release(); + chdir_notify_reparent("files-backend $GIT_DIR", + >gitdir); + chdir_notify_reparent("files-backend $GIT_COMMONDIR", + >gitcommondir); + return ref_store; } diff --git a/refs/packed-backend.c b/refs/packed-backend.c index 65288c6472..369c34f886 100644 --- a/refs/packed-backend.c +++ b/refs/packed-backend.c @@ -5,6 +5,7 @@ #include "packed-backend.h" #include "../iterator.h" #include "../lockfile.h" +#include "../chdir-notify.h" enum mmap_strategy { /* @@ -202,6 +203,8 @@ struct ref_store *packed_ref_store_create(const char *path, refs->store_flags = store_flags; refs->path = xstrdup(path); + chdir_notify_reparent("packed-refs", >path); + return ref_store; } diff --git a/t/t1501-work-tree.sh b/t/t1501-work-tree.sh index b06210ec5e..a5db53337b 100755 --- a/t/t1501-work-tree.sh +++ b/t/t1501-work-tree.sh @@ -431,4 +431,16 @@ test_expect_success 'error out gracefully on invalid $GIT_WORK_TREE' ' ) ' +test_expect_success 'refs work with relative gitdir and work tree' ' + git init relative && + git -C relative commit --allow-empty -m one && + git -C relative commit --allow-empty -m two && + + GIT_DIR=relative/.git GIT_WORK_TREE=relative git reset HEAD^ && + + git -C relative log -1 --format=%s >actual && + echo one >expect && + test_cmp expect actual +' + test_done -- 2.17.0.rc2.594.gdb94a0ce02
[PATCH v2 2/5] trace.c: export trace_setup_key
From: Nguyễn Thái Ngọc DuyThis is so that we can print traces based on this key outside trace.c. Signed-off-by: Nguyễn Thái Ngọc Duy Signed-off-by: Jeff King --- trace.c | 14 +++--- trace.h | 1 + 2 files changed, 8 insertions(+), 7 deletions(-) diff --git a/trace.c b/trace.c index 7f3b08e148..fc623e91fd 100644 --- a/trace.c +++ b/trace.c @@ -26,6 +26,7 @@ struct trace_key trace_default_key = { "GIT_TRACE", 0, 0, 0 }; struct trace_key trace_perf_key = TRACE_KEY_INIT(PERFORMANCE); +struct trace_key trace_setup_key = TRACE_KEY_INIT(SETUP); /* Get a trace file descriptor from "key" env variable. */ static int get_trace_fd(struct trace_key *key) @@ -300,11 +301,10 @@ static const char *quote_crnl(const char *path) /* FIXME: move prefix to startup_info struct and get rid of this arg */ void trace_repo_setup(const char *prefix) { - static struct trace_key key = TRACE_KEY_INIT(SETUP); const char *git_work_tree; char *cwd; - if (!trace_want()) + if (!trace_want(_setup_key)) return; cwd = xgetcwd(); @@ -315,11 +315,11 @@ void trace_repo_setup(const char *prefix) if (!prefix) prefix = "(null)"; - trace_printf_key(, "setup: git_dir: %s\n", quote_crnl(get_git_dir())); - trace_printf_key(, "setup: git_common_dir: %s\n", quote_crnl(get_git_common_dir())); - trace_printf_key(, "setup: worktree: %s\n", quote_crnl(git_work_tree)); - trace_printf_key(, "setup: cwd: %s\n", quote_crnl(cwd)); - trace_printf_key(, "setup: prefix: %s\n", quote_crnl(prefix)); + trace_printf_key(_setup_key, "setup: git_dir: %s\n", quote_crnl(get_git_dir())); + trace_printf_key(_setup_key, "setup: git_common_dir: %s\n", quote_crnl(get_git_common_dir())); + trace_printf_key(_setup_key, "setup: worktree: %s\n", quote_crnl(git_work_tree)); + trace_printf_key(_setup_key, "setup: cwd: %s\n", quote_crnl(cwd)); + trace_printf_key(_setup_key, "setup: prefix: %s\n", quote_crnl(prefix)); free(cwd); } diff --git a/trace.h b/trace.h index 88055abef7..2b6a1bc17c 100644 --- a/trace.h +++ b/trace.h @@ -15,6 +15,7 @@ extern struct trace_key trace_default_key; #define TRACE_KEY_INIT(name) { "GIT_TRACE_" #name, 0, 0, 0 } extern struct trace_key trace_perf_key; +extern struct trace_key trace_setup_key; extern void trace_repo_setup(const char *prefix); extern int trace_want(struct trace_key *key); -- 2.17.0.rc2.594.gdb94a0ce02
[PATCH v2 0/5] re-parenting relative directories after chdir
On Wed, Mar 28, 2018 at 01:36:56PM -0400, Jeff King wrote: > For those just joining us, this fixes a regression that was in v2.13 (so > nothing we need to worry about as part of the 2.17-rc track). > > [1/4]: set_git_dir: die when setenv() fails > [2/4]: add chdir-notify API > [3/4]: set_work_tree: use chdir_notify > [4/4]: refs: use chdir_notify to update cached relative paths Here's a re-roll based on the feedback I got, including: - fixes the memory leak and vague comment pointed out by Eric - adds in tracing code from Duy - I took Duy's suggestions regarding "least" surprise in some of the functions (reparenting NULL is a noop, and registering a reparent handler does so even for an absolute path). I punted on the "registering the same path twice" thing. That is a potential way to misuse this API, but I don't think there's a good solution. The "reparent" helper could figure this out for you, but in the general case we actually install an arbitrary callback. So the caller really has to handle it there. I think in the long run we'd want to outlaw calling set_git_dir() twice anyway. But possibly the handler-installers should be more careful. I'm dropping poor Rafael from the cc here, since it is turning into quite a lot of messages. :) I think at this point his bug has been duly reported and we are working on the fix. [1/5]: set_git_dir: die when setenv() fails [2/5]: trace.c: export trace_setup_key [3/5]: add chdir-notify API [4/5]: set_work_tree: use chdir_notify [5/5]: refs: use chdir_notify to update cached relative paths Makefile | 1 + cache.h | 2 +- chdir-notify.c| 93 +++ chdir-notify.h| 73 + environment.c | 26 ++-- refs/files-backend.c | 6 +++ refs/packed-backend.c | 3 ++ setup.c | 9 ++--- t/t1501-work-tree.sh | 12 ++ trace.c | 14 +++ trace.h | 1 + 11 files changed, 223 insertions(+), 17 deletions(-) create mode 100644 chdir-notify.c create mode 100644 chdir-notify.h -Peff
[PATCH v2 1/5] set_git_dir: die when setenv() fails
The set_git_dir() function returns an error if setenv() fails, but there are zero callers who pay attention to this return value. If this ever were to happen, it could cause confusing results, as sub-processes would see a potentially stale GIT_DIR (e.g., if it is relative and we chdir()-ed to the root of the working tree). We _could_ try to fix each caller, but there's really nothing useful to do after this failure except die. Let's just lump setenv() failure into the same category as malloc failure: things that should never happen and cause us to abort catastrophically. Signed-off-by: Jeff King--- cache.h | 2 +- environment.c | 5 ++--- 2 files changed, 3 insertions(+), 4 deletions(-) diff --git a/cache.h b/cache.h index a61b2d3f0d..5c24394d84 100644 --- a/cache.h +++ b/cache.h @@ -477,7 +477,7 @@ extern const char *get_git_common_dir(void); extern char *get_object_directory(void); extern char *get_index_file(void); extern char *get_graft_file(void); -extern int set_git_dir(const char *path); +extern void set_git_dir(const char *path); extern int get_common_dir_noenv(struct strbuf *sb, const char *gitdir); extern int get_common_dir(struct strbuf *sb, const char *gitdir); extern const char *get_git_namespace(void); diff --git a/environment.c b/environment.c index d6dd64662c..e01acf8b11 100644 --- a/environment.c +++ b/environment.c @@ -296,13 +296,12 @@ char *get_graft_file(void) return the_repository->graft_file; } -int set_git_dir(const char *path) +void set_git_dir(const char *path) { if (setenv(GIT_DIR_ENVIRONMENT, path, 1)) - return error("Could not set GIT_DIR to '%s'", path); + die("could not set GIT_DIR to '%s'", path); repo_set_gitdir(the_repository, path); setup_git_env(); - return 0; } const char *get_log_output_encoding(void) -- 2.17.0.rc2.594.gdb94a0ce02
Re: [PATCH] setup.c: reset candidate->work_tree after freeing it
Duy Nguyenwrites: > On Fri, Mar 30, 2018 at 7:10 PM, Junio C Hamano wrote: > >> Which fields in candidate are safe to peek by the caller? How can a >> caller tell? > > To me, all fields should be valid after > check_repository_format_gently(). If so, free() is wrong in the first place, and FREE_AND_NULL() is making it even worse, no? We learned there is work_tree set to somewhere, the original code by Peff before abade65b ("setup: expose enumerated repo info", 2017-11-12) freed it because the code no longer needed that piece of information. If we are passing all we learned back to the caller, we should not free the field in the function at all. But it seems (below) the codepath is messier than that. > We still need to free and set NULL here though in addition to a > cleanup interface. The reason is, when checking repo config from a > worktree, we deliberately ignore core.worktree (which belongs to the > main repo only). The implicit line near this > free(candidate->work_tree) is "leave git_work_tree_cfg alone, we don't > recognize core.worktree". Once we move setting git_work_tree_cfg out > of this function, this becomes clear. So in other words, there is a code that looks at the field and it _wants_ to see NULL there---otherwise that brittle code misbehaves and FREE_AND_NULL() is a bad-aid to work it around? Then proposed log message "leaving it dangling is unsanitary" is *not* what is going on here, and the real reason why the code should be like so deserve to be described both in the log message and in a large in-code comment, no?
Re: [PATCH] builtin/config.c: prefer `--type=bool` over `--bool`, etc.
On Fri, Mar 30, 2018 at 09:00:05AM -0700, Junio C Hamano wrote: > Jeff Kingwrites: > > > ... But actually, last-one-wins applies only > > to a _single_ option, not necessarily unrelated ones. Many other > > multi-action commands actually have a series of separate boolean flags, > > and then complain when more than one of the flags is set. > > > > So maybe it's not such a good idea for the actions (I do still think > > it's the right path for the types). > > If this were using command verbs (e.g. "git config get foo.bar") as > opposed to command options (e.g. "git config --get foo.bar"), it > wouldn't ahve allowed multiple command verbs from the command line, > and last-one-wins would not have made much sense because there is no > way to trigger it as a desirable "feature". > > Just like the topic of the discussion unifies --int/--bool/etc. into > a single --type={int,bool,...}, perhaps the existing command options > --get/--list/etc. can be taken as if they were a mistaken historical > way to spell --action={get,list,...}. I of course am not recommending > to add a new "--action" option. I am suggesting it as a thought-aid > to see if actions are all that different from value type options. > > I agree that a-bit-per-type that is checked with HAS_MULTI_BITS() > for error at the end does not make much sense. I also think what > you did in this patch for actions is a good clean-up for the above > reason. I agree the code internally is nicer after the patch. If we throw away the argument of "last-one-wins is more consistent with other parts of git" (because I don't really think that it is for this type of option), I could possibly buy this as a code cleanup. But it does have a user-visible impact, which makes the question: are we _OK_ with switching to last-one-wins? -Peff
Re: [PATCH v3 3/3] builtin/config: introduce `color` type specifier
Taylor Blauwrites: > @@ -184,6 +183,7 @@ Valid `[type]`'s include: > --bool-or-int:: > --path:: > --expiry-date:: > +--color:: >Historical options for selecting a type specifier. Prefer instead `--type`, >(see: above). > > @@ -223,6 +223,9 @@ Valid `[type]`'s include: > output it as the ANSI color escape sequence to the standard > output. The optional `default` parameter is used instead, if > there is no color configured for `name`. > ++ > +It is preferred to use `--type=color`, or `--type=color --default=[default]` > +instead of `--get-color`. Wasn't the whole point of the preliminary --type= patch to avoid having to add thse two hunks?
Re: [PATCH v3 1/3] builtin/config: introduce `--default`
Taylor Blauwrites: > For some use cases, callers of the `git-config(1)` builtin would like to > fallback to default values when the slot asked for does not exist. In > addition, users would like to use existing type specifiers to ensure > that values are parsed correctly when they do exist in the > configuration. > ... > +--default value:: > + When using `--get`, and the requested slot is not found, behave as if value > + were the value assigned to the that slot. For "diff..color", the above is OK, but in general, configuration variables are not called "slot". s/slot/variable/.
Re: [PATCH] setup.c: reset candidate->work_tree after freeing it
On Fri, Mar 30, 2018 at 7:10 PM, Junio C Hamanowrote: > Nguyễn Thái Ngọc Duy writes: > >> Dangling pointers are usually bad news. Reset it back to NULL. >> >> Signed-off-by: Nguyễn Thái Ngọc Duy >> --- > > Before abade65b ("setup: expose enumerated repo info", 2017-11-12), > candidate was an on-stack variable local to this function, so there > was no need to do anything before returning. After that commit, we > pass _fmt down the codepath so theoretically the caller could > peek into repo_fmt.work_tree after this codepath, which may be bad. > But if candidate->work_tree were not NULL at this point, that would > mean that such a caller that peeks is getting a WRONG information, > no? It thinks there were no core.worktree set but in reality there > was the configuration set in the repository, no? > > Which fields in candidate are safe to peek by the caller? How can a > caller tell? To me, all fields should be valid after check_repository_format_gently(). Right now the caller does not need to read any info from repo_fmt because check_repo... passes the information in another way, via global variables like is_bare_repository_cfg and git_work_tree_cfg. > It seems that setup_git_directory_gently() uses repo_fmt when > calling various variants of setup_*_git_dir() and then uses the > repo_fmt.hash_algo field later. Yes. Though if we're going to reduce global state further more, then the "if (!has_common)" should be done by the caller, then we need access to all fields in repo_fmt > If we want to keep fields of repo_fmt alive and valid after > check_repository_format_gently() and callers of it like > setup_*_git_dir() returns, then perhaps the right fix is not to free > candidate->work_tree here, and instead give an interface to clean up > repository_format instance, so that the ultimate caller like > setup_git_directory_gently() can safely peek into any fields in it > and then clean it up after it is done? We still need to free and set NULL here though in addition to a cleanup interface. The reason is, when checking repo config from a worktree, we deliberately ignore core.worktree (which belongs to the main repo only). The implicit line near this free(candidate->work_tree) is "leave git_work_tree_cfg alone, we don't recognize core.worktree". Once we move setting git_work_tree_cfg out of this function, this becomes clear. I didn't think all of this when I wrote this patch though. It was "hey, it's theoretical bug so let's fix it". Only later on when I refactored more that I realized what it meant. -- Duy
Re: [PATCH 3/4] set_work_tree: use chdir_notify
On Thu, Mar 29, 2018 at 08:01:33PM +0200, Duy Nguyen wrote: > > I do kind of like that. I'm reasonably happy with the chdir_notify() > > interface, but it would be nicer still if we could get rid of it in the > > first place. It's true that we _could_ chdir from other places, but > > There's another place we do, that I should mention and keep > forgetting. Our run-command.c code allows to switch cwd, and if > $GIT_DIR and stuff is relative then we should reparent them too just > like we do with setup_work_tree(). Your chdir-notify makes it super > easy to support that, we just need to move the prep_childenv() down > below chdir(). But since nobody has complaint, I suppose that feature > is not really popular (or at least not used to launch another git > process anyway) Yeah, I hadn't really considered that, since it would only matter for environment variables, not for internal strings (which are all about to get thrown out due to execve anyway). And my patch was just focusing on that. But I also wonder if rewriting the environment variables would matter here. If we're going to chdir for a child, we'd generaly be clearing repo-related env anyway. -Peff
Re: [PATCH 0/8] Re: git complains packed-refs is not a directory when used with GIT_DIR and GIT_WORK_TREE envvars.
On Thu, Mar 29, 2018 at 04:57:26PM +0200, Duy Nguyen wrote: > On Wed, Mar 28, 2018 at 02:19:32PM -0400, Jeff King wrote: > > > > > I will probably rework on top of your chdir-notify instead (and let > > > yours to be merged earlier) > > > > Thanks. I like some of the related changes you made, like including this > > in the tracing output. That should be easy to do on top of mine, I > > think. > > Yeah. But is it possible to sneak something like this in your series > (I assume you will reroll anyway)? I could do it separately, but it > looks nicer if it's split out and merged in individual patches that > add new chdir-notify call site. Sure. > -void chdir_notify_register(chdir_notify_callback cb, void *data) > +void chdir_notify_register(const char *name, > +chdir_notify_callback cb, > +void *data) > { > struct chdir_notify_entry *e = xmalloc(sizeof(*e)); > e->cb = cb; > e->data = data; > + e->name = name; > list_add_tail(>list, _notify_entries); > } I'm tempted to make a copy of the name here (or at least document that it must remain valid forever). -Peff
Re: Bad refspec messes up bundle.
Johannes Schindelinwrites: > Hi Luciano, > >> > Is this a bug? Should bundle allow providing multiple refspecs when > ... > I agree that it is a bug if a bundle file records a ref multiple times. > Luciano, here are some pointers so you can fix it: > > - probably the best way to start would be to add a new test case to > t/t5607-clone-bundle.sh. The script *should* be relatively easy to > understand and imitate. The new test case would probably look somewhat > like this: > > test_expect_failure 'bundles must not contain multiple refs' ' s/multiple/duplicate/. It is not unusual for a bundle to record more than one ref; it is (1) useless and harmful to unsuspecting clients to record the same ref twice with the same value and (2) nonesnse to record the same ref twice with different value. Other than that, the outline seems to go in the right general direction.
Re: [PATCH] setup.c: reset candidate->work_tree after freeing it
Nguyễn Thái Ngọc Duywrites: > Dangling pointers are usually bad news. Reset it back to NULL. > > Signed-off-by: Nguyễn Thái Ngọc Duy > --- Before abade65b ("setup: expose enumerated repo info", 2017-11-12), candidate was an on-stack variable local to this function, so there was no need to do anything before returning. After that commit, we pass _fmt down the codepath so theoretically the caller could peek into repo_fmt.work_tree after this codepath, which may be bad. But if candidate->work_tree were not NULL at this point, that would mean that such a caller that peeks is getting a WRONG information, no? It thinks there were no core.worktree set but in reality there was the configuration set in the repository, no? Which fields in candidate are safe to peek by the caller? How can a caller tell? It seems that setup_git_directory_gently() uses repo_fmt when calling various variants of setup_*_git_dir() and then uses the repo_fmt.hash_algo field later. If we want to keep fields of repo_fmt alive and valid after check_repository_format_gently() and callers of it like setup_*_git_dir() returns, then perhaps the right fix is not to free candidate->work_tree here, and instead give an interface to clean up repository_format instance, so that the ultimate caller like setup_git_directory_gently() can safely peek into any fields in it and then clean it up after it is done? > setup.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/setup.c b/setup.c > index 7287779642..d193bee192 100644 > --- a/setup.c > +++ b/setup.c > @@ -482,7 +482,7 @@ static int check_repository_format_gently(const char > *gitdir, struct repository_ > inside_work_tree = -1; > } > } else { > - free(candidate->work_tree); > + FREE_AND_NULL(candidate->work_tree); > } > > return 0;
Re: [PATCH 1/9] git_config_set: fix off-by-two
On Fri, Mar 30, 2018 at 2:32 PM, Johannes Schindelinwrote: > Hi, > > On Thu, 29 Mar 2018, Jeff King wrote: > >> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote: >> >> > > When calling `git config --unset abc.a` on this file, it leaves this >> > > (invalid) config behind: >> > > >> > > [ >> > > [xyz] >> > > key = value >> > > >> > > The reason is that we try to search for the beginning of the line (or >> > > for the end of the preceding section header on the same line) that >> > > defines abc.a, but as an optimization, we subtract 2 from the offset >> > > pointing just after the definition before we call >> > > find_beginning_of_line(). That function, however, *also* performs that >> > > optimization and promptly fails to find the section header correctly. >> > >> > This commit message would be more convincing if we had it in test form. >> >> I agree a test might be nice. But I don't find the commit message >> unconvincing at all. It explains pretty clearly why the bug occurs, and >> you can verify it by looking at find_beginning_of_line. >> >> > [abc]a >> > >> > is not written by Git, but would be written from an outside tool or person >> > and we barely cope with it? >> >> Yes, I don't think git would ever write onto the same line. But clearly >> we should handle anything that's syntactically valid. > > I was tempted to add the test case, because it is easy to test it. > > But I then decided *not* to add it. Why? Testing is a balance between "can > do" and "need to do". > > Can you imagine that I did *not* run the entire test suite before > submitting this patch series, because it takes an incredible *90 minutes* > to run *on a fast Windows machine*? What's wrong with firing up a new worktree, run the test suite there and go back to do something else so you won't waste time just waiting for test results and submit? Sure there is a mental overhead for switching tasks, but at 90 minutes, I think it's worth doing. -- Duy
Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)
Hi Johannes, Johannes Schindelinwrites: > Hi Sergey, > > On Wed, 28 Mar 2018, Sergey Organov wrote: > >> Johannes Schindelin writes: >> >> > I use rebase every day. I use the Git garden shears every week. If you >> > do not trust my experience with these things, nothing will convince >> > you. >> >> Unfortunately you have exactly zero experience with rebasing merges as >> you've never actually rebased them till now, and it's rebasing merges >> that matters in this particular discussion. > > Who says that I have 0 experience with that? Oh yes, you do. Like, as if > you know. I just didn't see even single symptom of it in the discussion, still I said nothing about it until you started to use your presumed experience in place of true arguments. > Guess what I do with those Git garden shears' merges? Can you guess? Of > course you can. But you'll never know until I tell you. It is a little > silly to try to tell me that I do not have any experience with rebasing > merges when you have no idea what strategies I tried in the past. Please notice that I never even started to discuss your 'merge' directive, exactly because I believe you have huge experience both implementing and using it that could be relied upon. Just don't mix-in rebasing merge commits into it, as that is fundamentally different operation. And the other unspoken strategies you tried are irrelevant here, as you declined the whole idea of replaying merge-the-commit instead of replaying merge-the-operation until recently, and it seems you still do, at least to some level, by attempting to use 'merge' operation to replay "the (merge) _commit_". I'm afraid that whatever you've tried in the past likely suffered from the same conceptual confusion, and thus did not work indeed. That said, negative experience is still an experience, often helpful, but what is relevant here is that it likely was not about rebasing merge commits at all, so trying to use this irrelevant experience in the discussion to support your arguments, or rather lack of them, seems even more unfair to me then using relevant experience for that purpose. > Now, Phillip's strategy is clearly the best strategy I ever heard > about, For 'pick' vs 'merge', it's not about strategy, it's about concept. It looks like you still believe that you somehow "merge" a merge commit when you actually rebase it (with Phillip's strategy if you wish). You don't merge it anywhere, period. > and I am in the process of doing Actual Work to Put It To The Test. That's the best outcome of the discussion I ever hoped for, seriously, and I'll be all ears to listen to the outcomes of the experience you will gain with it. BTW, when you have it working, use the strategy you've implemented for non-merge commits as well, as a good one should still work fine. Moreover, it should better bring exactly the same results as the default existing strategy being used for non-merge commits, even in conflicting situations. Hopefully /that/ experience will finally push you strong enough to get the concept right, and you will finally understand that what you've implemented is nothing else but a _cherry-pick_ of a _merge commit_, that reads simply _pick_, for brevity. -- Sergey
Re: [PATCH 1/9] git_config_set: fix off-by-two
Ævar Arnfjörð Bjarmasonwrites: > I think if it's worth fixing it's worth testing for, a future change to > the config code could easily introduce a regression for this, and > particularly in this type of code obscure edge cases like this can point > to bugs elsewhere. Yup. "The port to my favourite platform is too slow, and everybody should learn to live with thin test coverage" would not be a good strategy in the longer run.
Re: [PATCH v4 3/5] stash: convert drop and clear to builtin
Joel Teichroebwrites: > +static int do_clear_stash(void) > +{ > + struct object_id obj; > + if (get_oid(ref_stash, )) > + return 0; > + return delete_ref(NULL, ref_stash, , 0); > +} Here you see if the "refs/stash" is there, and learn what its current value is, using get_oid(), so that you can call delete_ref() with it. > +static int do_drop_stash(const char *prefix, struct stash_info *info) > +{ > + ... > + cp.git_cmd = 1; > + /* Even though --quiet is specified, rev-parse still outputs the hash */ > + cp.no_stdout = 1; > + argv_array_pushl(, "rev-parse", "--verify", "--quiet", NULL); > + argv_array_pushf(, "%s@{0}", ref_stash); > + ret = run_command(); > + if (ret) > + do_clear_stash(); Here you call out to rev-parse as an external process. Isn't doing the same get_oid() sufficient? Not limited to the above examples, the conversion in this series feels somewhat unbalanced---doing easy things like this by forking an external process and then doing a relatively heavyweight thing like merge operation (in 2/5) in-process feels the other way around. Thanks.
Re: Draft of Git Rev News edition 37
On Fri, Mar 30, 2018 at 2:02 AM, Johannes Schindelinwrote: > Hi, > > On Mon, 19 Mar 2018, Christian Couder wrote: > >> On Sun, Mar 18, 2018 at 11:41 PM, Jacob Keller >> wrote: >> > >> > I don't have a good summary yet, but I think a section about the >> > discussion regarding the new recreate-merges and rebasing merges >> > that's been on going might be useful? >> >> Yeah sure, we would gladly accept a summary of this discussion. > > I would *love* a summary of that discussion, especially since it got > pretty unwieldy (and partially out of hand, but that part probably does > not need a lot of detail apart from the adjective "heated"). > >> > a lot of that discussion occurred prior to git-merge (tho it's been >> > ongoing since then?). >> >> If you want to take the latest discussions into account, the summary >> could be either split into two parts, one for this edition and the >> other one for the next edition. Or we could wait and have the whole >> summary in the next edition. > > Jake, I do not know about your availability, but I would love it if you > could take a stab, as I trust you to be unbiased. I would not trust myself > to be unbiased because (as everybody saw who cared to read) I got a little > bit too emotional and should have stayed more professional. > > Thanks, > Dscho I hope to be able to make a summary of the discussion as best I can. It may take a bit as there is a lot of mails to read. I agree that a good summary should come from someone outside the discussion to reduce emotional bias. Thanks, Jake
Re: [PATCH] builtin/config.c: prefer `--type=bool` over `--bool`, etc.
Jeff Kingwrites: > ... But actually, last-one-wins applies only > to a _single_ option, not necessarily unrelated ones. Many other > multi-action commands actually have a series of separate boolean flags, > and then complain when more than one of the flags is set. > > So maybe it's not such a good idea for the actions (I do still think > it's the right path for the types). If this were using command verbs (e.g. "git config get foo.bar") as opposed to command options (e.g. "git config --get foo.bar"), it wouldn't ahve allowed multiple command verbs from the command line, and last-one-wins would not have made much sense because there is no way to trigger it as a desirable "feature". Just like the topic of the discussion unifies --int/--bool/etc. into a single --type={int,bool,...}, perhaps the existing command options --get/--list/etc. can be taken as if they were a mistaken historical way to spell --action={get,list,...}. I of course am not recommending to add a new "--action" option. I am suggesting it as a thought-aid to see if actions are all that different from value type options. I agree that a-bit-per-type that is checked with HAS_MULTI_BITS() for error at the end does not make much sense. I also think what you did in this patch for actions is a good clean-up for the above reason.
Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)
Hi Johannes, Johannes Schindelinwrites: > Hi Sergey, > > On Fri, 30 Mar 2018, Sergey Organov wrote: > >> Could we please agree to stop using backward compatibility as an >> objection in the discussion of the --recreate-merges feature? > > No. > > The expectation of users as to what a `pick` is has not changed just > because you wish it would. As if I ever suggested to change user expectations. Could you please stop putting words into my mouth? I _am_ a user, and I expect 'pick' to pick commits, no matter how many parents they might have. And no, --preserve-merges did not ever pick commits with number of parents more than one, it rather threw them away and re-merged the heads. Calling it 'pick' was a huge mistake indeed! Fixing that mistake is what I expect, as a user. Just teach the 'pick' to correctly pick any commit, please! > > That is a matter of backwards-compatibility. OK, fine, at least its only about user expectations and not about some scripting incompatibility. > You see, if you are driving a car for a hundred years already, and then > switch to a different car, and it has a lever in the same place as your > previous car's windshield wiper, but in the new car it has a button that > activates the emergency driver seat ejection OMG *it has a seat ejection > like in the James Bond movies! Where can I get that car?* Sorry for > disgressing. Except it's irrelevant as the 'pick' will still pick commits. > I am really concerned about that willingness to put an innocuous button, > so to speak, onto something users got really used to, over the course of a > decade or so, when that button should really be made red and blinking and > OMG where can I get that car? It's irrelevant as the 'pick' will still pick commits. > So to reiterate, I am really interested in a practical solution that won't > cause nasty surprises. I rather don't see how it possibly could cause any surprises, especially compared to using 'merge' to pick commits. > Meaning: `pick` != merge. Exactly! Use 'merge' when you merge, as you are already doing. Use 'pick' when you are picking. You don't merge "merge commit" when you are picking it! > That was a mistake in preserve-merges, as I have only mentioned like a > hundred times, and we won't repeat it. The mistake was that it used 'pick' to denote re-merge. You already fixed that mistake by introducing 'merge' to re-merge, thanks God. Please don't commit yet another mistake by now using 'merge' to pick! -- Sergey
Re: [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug)
On Thu, Mar 29 2018, Johannes Schindelin wrote: > Nonetheless, I would be confortable with this patch going into v2.17.0, even > at > this late stage. The final verdict is Junio's, of course. Thanks a lot for working on this. I'm keen to stress test this, but won't have time in the next few days, and in any case think that the parts that change functionality should wait until after 2.17 (but e.g. the test renaming would be fine for a cherry-pick).
Re: [PATCH 1/9] git_config_set: fix off-by-two
On Fri, Mar 30 2018, Johannes Schindelin wrote: > On Thu, 29 Mar 2018, Jeff King wrote: > >> On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote: >> >> > > When calling `git config --unset abc.a` on this file, it leaves this >> > > (invalid) config behind: >> > > >> > > [ >> > > [xyz] >> > > key = value >> > > >> > > The reason is that we try to search for the beginning of the line (or >> > > for the end of the preceding section header on the same line) that >> > > defines abc.a, but as an optimization, we subtract 2 from the offset >> > > pointing just after the definition before we call >> > > find_beginning_of_line(). That function, however, *also* performs that >> > > optimization and promptly fails to find the section header correctly. >> > >> > This commit message would be more convincing if we had it in test form. >> >> I agree a test might be nice. But I don't find the commit message >> unconvincing at all. It explains pretty clearly why the bug occurs, and >> you can verify it by looking at find_beginning_of_line. >> >> > [abc]a >> > >> > is not written by Git, but would be written from an outside tool or person >> > and we barely cope with it? >> >> Yes, I don't think git would ever write onto the same line. But clearly >> we should handle anything that's syntactically valid. > > I was tempted to add the test case, because it is easy to test it. > > But I then decided *not* to add it. Why? Testing is a balance between "can > do" and "need to do". > > Can you imagine that I did *not* run the entire test suite before > submitting this patch series, because it takes an incredible *90 minutes* > to run *on a fast Windows machine*? I think if it's worth fixing it's worth testing for, a future change to the config code could easily introduce a regression for this, and particularly in this type of code obscure edge cases like this can point to bugs elsewhere. We have the EXPENSIVE_ON_WINDOWS prerequisite already in master from an earlier series of mine, maybe we could use that here, or add some other prereq like OVERLY_EXHAUSTIVE which by default could depend on EXPENSIVE_ON_WINDOWS, i.e. we'd have a set of overly pedantic tests that we skip on Windows by default, as there's no reason to suspect they're platform-dependent, but we'd like to know if they regress.
Re: [PATCH 8/9] git_config_set: use do_config_from_file() directly
On Fri, Mar 30, 2018 at 04:01:56PM +0200, Johannes Schindelin wrote: > You know what is *really* funny? > > -- snip -- > static int git_config_from_stdin(config_fn_t fn, void *data) > { > return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, > data, 0); > } > > int git_config_from_file(config_fn_t fn, const char *filename, void *data) > { > int ret = -1; > FILE *f; > > f = fopen_or_warn(filename, "r"); > if (f) { > flockfile(f); > ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, > filename, f, data, 0); > funlockfile(f); > fclose(f); > } > return ret; > } > -- snap -- > > So the _stdin variant *goes out of its way not to flockfile()*... *facepalm* That's probably my fault, since git_config_from_stdin() existed already when I did the flockfile stuff. Probably the flockfile should go into do_config_from_file(), where we specify to use the unlocked variants. > But I guess all this will become moot when I start handing down the config > options. It does mean that I have to change the signatures in header > files, oh well ;-) > > But then I can drop this here patch and we can stop musing about > flockfile() ;-) Yeah, I'll wait to see how your refactor turns out. -Peff
Re: [PATCH v5 0/6] worktree: teach "add" to check out existing branches
On 03/27, Eric Sunshine wrote: > On Sun, Mar 25, 2018 at 9:49 AM, Thomas Gummererwrote: > > Thanks Eric for the review of the previous round and Duy and Junio for > > additional comments. > > This round should address all of Eric's comments from the previous round. > > Thanks, it appears to cover my review comments from the previous > round. I do have some additional comments on this round (which I could > have raised with the previous round if I had thought of them at the > time). > > > As explained in more detail in a reply to the review comment directly, > > I did not add an enum to 'struct add_opts', for 'force_new_branch' and > > 'checkout_existing_branch', but instead removed 'force_new_branch' > > from the struct as it's not required. > > Makes sense. In fact, I had thoughts along these lines during your > previous dwim-ery series. See my comments on patch 3/6. > > > The rest of the updates are mainly in the user facing messages, > > documentation and one added test. > > Interdiff below: > > The interdiff looks sane. Unfortunately, due to UI regressions, I'm > having second thoughts about whether this series is going in the right > direction. See my comments on patch 2/6. Thanks for your reviews of this series! As I mentioned in the reply there I'm going to see whether or not I can fix those regressions (hopefully I can :)), and send a re-roll.
Re: [PATCH 8/9] git_config_set: use do_config_from_file() directly
Hi Peff, On Fri, 30 Mar 2018, Jeff King wrote: > On Fri, Mar 30, 2018 at 03:02:00PM +0200, Johannes Schindelin wrote: > > > > I'm not sure I understand that last paragraph. What does flockfile() have > > > to do with stdin/stdout? > > > > > > The point of those calls is that we're locking the FILE handle, so that > > > it's safe for the lower-level config code to run getc_unlocked(), which > > > is faster. > > > > > > So without those, we're calling getc_unlocked() without holding the > > > lock. I think it probably works in practice because we know that we're > > > single-threaded, but it seems a bit sketchy. > > > > Oops. I misunderstood the purpose of flockfile(), then. I thought it was > > only about multiple users of stdin/stdout. > > > > Will have a look whether flockfile()/funlockfile() can be moved into > > do_config_from_file() instead. > > In a sense stdin/stdout are much more susceptible to this because > they're global variables, and any thread may touch them. For the config > code, we open our own handle that we don't expose elsewhere. So probably > it would be fine just to use the unlocked variants even without locking. > > But IMHO it's good practice to always flockfile() before using the > unlocked variants. My reading of POSIX is that it's OK to use the > unlocked variants without holding the lock (if you know there won't be > contention), but if it's not hard to err on the side of safety, I'd > prefer it. You know what is *really* funny? -- snip -- static int git_config_from_stdin(config_fn_t fn, void *data) { return do_config_from_file(fn, CONFIG_ORIGIN_STDIN, "", NULL, stdin, data, 0); } int git_config_from_file(config_fn_t fn, const char *filename, void *data) { int ret = -1; FILE *f; f = fopen_or_warn(filename, "r"); if (f) { flockfile(f); ret = do_config_from_file(fn, CONFIG_ORIGIN_FILE, filename, filename, f, data, 0); funlockfile(f); fclose(f); } return ret; } -- snap -- So the _stdin variant *goes out of its way not to flockfile()*... But I guess all this will become moot when I start handing down the config options. It does mean that I have to change the signatures in header files, oh well ;-) But then I can drop this here patch and we can stop musing about flockfile() ;-) Ciao, Dscho
Re: [PATCH v5 5/6] worktree: teach "add" to check out existing branches
On 03/27, Eric Sunshine wrote: > On Sun, Mar 25, 2018 at 9:49 AM, Thomas Gummererwrote: > > Currently 'git worktree add ' creates a new branch named after the > > basename of the path by default. If a branch with that name already > > exists, the command refuses to do anything, unless the '--force' option > > is given. > > > > However we can do a little better than that, and check the branch out if > > it is not checked out anywhere else. This will help users who just want > > to check an existing branch out into a new worktree, and save a few > > keystrokes. > > [...] > > Signed-off-by: Thomas Gummerer > > --- > > diff --git a/builtin/worktree.c b/builtin/worktree.c > > @@ -317,7 +318,10 @@ static int add_worktree(const char *path, const char > > *refname, > > - if (opts->new_branch) > > + if (opts->checkout_existing_branch) > > + fprintf_ln(stderr, _("checking out branch '%s'"), > > + refname); > > This fprintf_ln() can easily fit on one line; no need to wrap > 'refname' to the next one. > > Not worth a re-roll, though. > > > + else if (opts->new_branch) > > fprintf_ln(stderr, _("creating branch '%s'"), > > opts->new_branch); > > diff --git a/t/t2025-worktree-add.sh b/t/t2025-worktree-add.sh > > @@ -198,13 +198,26 @@ test_expect_success '"add" with omitted' ' > > -test_expect_success '"add" auto-vivify does not clobber existing branch' ' > > +test_expect_success '"add" checks out existing branch of dwimd name' ' > > test_commit c1 && > > test_commit c2 && > > git branch precious HEAD~1 && > > - test_must_fail git worktree add precious && > > + git worktree add precious && > > test_cmp_rev HEAD~1 precious && > > - test_path_is_missing precious > > + ( > > + cd precious && > > + test_cmp_rev precious HEAD > > + ) > > +' > > Looking at this more closely, once it stops being a "don't clobber > precious branch" test, it's doing more than it needs to do, thus is > confusing for future readers. The setup -- creating two commits and > making "precious" point one commit back -- was very intentional and > allowed the test to verify not only that the worktree wasn't created > but that "precious" didn't change to reference a different commit. > However, _none_ of this matters once it becomes a dwim'ing test; the > unnecessary code is confusing since it doesn't make sense in the > context of dwim'ing. I _think_ the entire test can collapse to: > > git branch existing && > git worktree add existing && > ( > cd existing && > test_cmp_rev existing HEAD > ) > > In other words, this patch should drop the "precious" test altogether > and just introduce a new dwim'ing test (and drop patch 6/6). > > I do think that with the potential confusion to future readers, this > does deserve a re-roll. Hmm I do think that it's important to have the branch we create the new working tree from different from the branch of the main working tree, to make sure we don't accidentally overwrite an existing branch, and to show that the new branch is checked out. Maybe I'm overly paranoid though? > > +test_expect_success '"add" auto-vivify fails with checked out branch' ' > > + git checkout -b test-branch && > > + test_must_fail git worktree add test-branch && > > + test_path_is_missing test-branch > > +' > > + > > +test_expect_success '"add --force" with existing dwimd name doesnt die' ' > > + git worktree add --force test-branch > > '
Re: [PATCH] builtin/config.c: prefer `--type=bool` over `--bool`, etc.
On Fri, Mar 30, 2018 at 01:27:19AM -0400, Taylor Blau wrote: > > If you really want to go all-out, I think the ACTION flags could use the > > same cleanup. We treat them as bitflags, and then issue an error when > > you set more than one, which is just silly. > > Agreed, and I think that this is a good candidate for a future patch. > Thoughts? :-). I actually worked this up for fun, though I had second thoughts while writing the commit message. Besides the cleanup, my primary motivation was following last-one-wins rules as we often do elsewhere. But actually, last-one-wins applies only to a _single_ option, not necessarily unrelated ones. Many other multi-action commands actually have a series of separate boolean flags, and then complain when more than one of the flags is set. So maybe it's not such a good idea for the actions (I do still think it's the right path for the types). For reference, here's the patch I wrote: builtin/config.c | 137 +-- 1 file changed, 72 insertions(+), 65 deletions(-) diff --git a/builtin/config.c b/builtin/config.c index 01169dd628..5581f48ac8 100644 --- a/builtin/config.c +++ b/builtin/config.c @@ -25,35 +25,36 @@ static char term = '\n'; static int use_global_config, use_system_config, use_local_config; static struct git_config_source given_config_source; -static int actions, types; +static int types; static int end_null; static int respect_includes_opt = -1; static struct config_options config_options; static int show_origin; -#define ACTION_GET (1<<0) -#define ACTION_GET_ALL (1<<1) -#define ACTION_GET_REGEXP (1<<2) -#define ACTION_REPLACE_ALL (1<<3) -#define ACTION_ADD (1<<4) -#define ACTION_UNSET (1<<5) -#define ACTION_UNSET_ALL (1<<6) -#define ACTION_RENAME_SECTION (1<<7) -#define ACTION_REMOVE_SECTION (1<<8) -#define ACTION_LIST (1<<9) -#define ACTION_EDIT (1<<10) -#define ACTION_SET (1<<11) -#define ACTION_SET_ALL (1<<12) -#define ACTION_GET_COLOR (1<<13) -#define ACTION_GET_COLORBOOL (1<<14) -#define ACTION_GET_URLMATCH (1<<15) - +enum config_action { + ACTION_NONE = 0, + ACTION_GET, + ACTION_GET_ALL, + ACTION_GET_REGEXP, + ACTION_REPLACE_ALL, + ACTION_ADD, + ACTION_UNSET, + ACTION_UNSET_ALL, + ACTION_RENAME_SECTION, + ACTION_REMOVE_SECTION, + ACTION_LIST, + ACTION_EDIT, + ACTION_SET, + ACTION_SET_ALL, + ACTION_GET_COLOR, + ACTION_GET_COLORBOOL, + ACTION_GET_URLMATCH, +}; /* - * The actions "ACTION_LIST | ACTION_GET_*" which may produce more than - * one line of output and which should therefore be paged. + * This must be an int and not an enum because we pass it by address + * to OPT_SETINT. */ -#define PAGING_ACTIONS (ACTION_LIST | ACTION_GET_ALL | \ - ACTION_GET_REGEXP | ACTION_GET_URLMATCH) +static int action; #define TYPE_BOOL (1<<0) #define TYPE_INT (1<<1) @@ -69,20 +70,20 @@ static struct option builtin_config_options[] = { OPT_STRING('f', "file", _config_source.file, N_("file"), N_("use given config file")), OPT_STRING(0, "blob", _config_source.blob, N_("blob-id"), N_("read config from given blob object")), OPT_GROUP(N_("Action")), - OPT_BIT(0, "get", , N_("get value: name [value-regex]"), ACTION_GET), - OPT_BIT(0, "get-all", , N_("get all values: key [value-regex]"), ACTION_GET_ALL), - OPT_BIT(0, "get-regexp", , N_("get values for regexp: name-regex [value-regex]"), ACTION_GET_REGEXP), - OPT_BIT(0, "get-urlmatch", , N_("get value specific for the URL: section[.var] URL"), ACTION_GET_URLMATCH), - OPT_BIT(0, "replace-all", , N_("replace all matching variables: name value [value_regex]"), ACTION_REPLACE_ALL), - OPT_BIT(0, "add", , N_("add a new variable: name value"), ACTION_ADD), - OPT_BIT(0, "unset", , N_("remove a variable: name [value-regex]"), ACTION_UNSET), - OPT_BIT(0, "unset-all", , N_("remove all matches: name [value-regex]"), ACTION_UNSET_ALL), - OPT_BIT(0, "rename-section", , N_("rename section: old-name new-name"), ACTION_RENAME_SECTION), - OPT_BIT(0, "remove-section", , N_("remove a section: name"), ACTION_REMOVE_SECTION), - OPT_BIT('l', "list", , N_("list all"), ACTION_LIST), - OPT_BIT('e', "edit", , N_("open an editor"), ACTION_EDIT), - OPT_BIT(0, "get-color", , N_("find the color configured: slot [default]"), ACTION_GET_COLOR), - OPT_BIT(0, "get-colorbool", , N_("find the color setting: slot [stdout-is-tty]"), ACTION_GET_COLORBOOL), + OPT_SET_INT(0, "get", , N_("get value: name [value-regex]"), ACTION_GET), + OPT_SET_INT(0, "get-all", , N_("get all values: key [value-regex]"), ACTION_GET_ALL), + OPT_SET_INT(0, "get-regexp", , N_("get values for regexp: name-regex [value-regex]"), ACTION_GET_REGEXP), + OPT_SET_INT(0, "get-urlmatch", , N_("get value specific for the URL: section[.var] URL"),
Re: [PATCH v5 3/6] worktree: remove force_new_branch from struct add_opts
On 03/27, Eric Sunshine wrote: > On Sun, Mar 25, 2018 at 9:49 AM, Thomas Gummererwrote: > > The 'force_new_branch' flag in 'struct add_opts' is only used inside the > > add function, where we already have the same information stored in the > > 'new_branch_force' variable. Avoid that unnecessary duplication. > > When I was reviewing your original dwim-ery series (inferring 'foo' > from 'origin/foo'), I noticed that 'struct add_opts' had accumulated a > number of unnecessary members over time, including this one. My plan > was to submit a patch removing all those pointless members after your > dwim-ery series settled, however, I never got around to it. Ah right, I didn't look at them in detail, so I failed to notice that. While I'm already in the area I may as well do that, thanks for the suggestion! > This patch might be a good opportunity for doing that cleanup > wholesale, removing all unneeded members rather than just the one. (If > so, you'd probably want to move to this patch to the front of the > series as cleanup before the meatier changes.) Anyhow, it's just a > thought; feel free to respond with "it's outside the scope of this > series" if you're not interested in doing it. > > > Signed-off-by: Thomas Gummerer
Re: [PATCH v5 2/6] worktree: be clearer when "add" dwim-ery kicks in
On 03/27, Eric Sunshine wrote: > On Sun, Mar 25, 2018 at 9:49 AM, Thomas Gummererwrote: > > Currently there is no indication in the "git worktree add" output that > > a new branch was created. This would be especially useful information > > in the case where the dwim of "git worktree add " kicks in, as the > > user didn't explicitly ask for a new branch, but we create one from > > them. > > Failed to notice this last time around: s/from/for/ > > Perhaps Junio can tweak it when queuing. > > > Print some additional output showing that a branch was created and the > > branch name to help the user. > > > > This will also be useful in the next commit, which introduces a new kind > > It's no longer the _next_ commit which does this. Perhaps say instead > "a subsequent commit". (Again, perhaps Junio can tweak it since it's > not worth a re-roll.) Right, will fix. > > of dwim-ery of checking out the branch if it exists instead of refusing > > to create a new worktree in that case, and where it's nice to tell the > > user which kind of dwim-ery kicked in. > > > > Signed-off-by: Thomas Gummerer > > --- > > diff --git a/builtin/worktree.c b/builtin/worktree.c > > @@ -318,6 +318,9 @@ static int add_worktree(const char *path, const char > > *refname, > > + if (opts->new_branch) > > + fprintf_ln(stderr, _("creating branch '%s'"), > > opts->new_branch); > > I didn't think of this before, but what happens with the original > dwim-ery you added which checks out a remote branch matching the > worktree name if no such local branch exists? I was wondering if we'd > want to see a distinct and more informative message in that case, such > as "creating branch '%s' from remote '%s'" or something. However, I > see that we're already getting a message: > > Branch 'foo' set up to track remote branch 'foo' from 'origin'. > creating branch 'foo' > new worktree HEAD is now at d3adb33f boople > > which is a bit weird since tracking appears to be set up before the > branch itself is created. I thought that this was another UI > regression of the sort I mentioned in [1], however, the messages were > out of order even before the current patch series: Yeah I agree this is a bit odd. I did think about this, and the main reason why I didn't change this by passing the '--quiet' flag to 'git branch' which we call in the 'add()' function is to keep the information about the remote tracking. Looking at where this information actually comes from, there's multiple different ways we'd have to distinguish to print this information properly. Now that I put some more thought into this, I think there are a couple of ways to solve this, the easiest and cleanest of which would probably be to run 'git branch' through 'pipe_command', save the output and print it after the "creating branch 'foo'" message. > Branch 'foo' set up to track remote branch 'foo' from 'origin'. > Preparing foo (identifier foo) > Checking out files: 100% (/), done. > HEAD is now at d3adb33f boople > > This highlights another regression introduced by this series. Patch > 1/6 suppresses the "Checking out files:" message, which is a bit > unfortunate for decent sized worktrees. I'm not sure I'm entirely > happy about that loss. Thoughts? Tbh I never really noticed the "Checking out files" output, because the repositories I used day to day are usually of a smaller size, where checking out the files takes less than a second, so I would never even notice this output. But while I don't care about it, others may, so I think it would be better to preserve this output. Maybe the best way to do that would be to not use 'run_command' to run 'git reset --hard', but to instead replace that with the internal functions. I haven't actually looked at how hard this would be, but I'm guessing this may be the best option to avoid this UI regression. I'll play with this some more and see what I can come up with, and send a re-roll hopefully in a few days. Thanks for your review! > [1]: > https://public-inbox.org/git/CAPig+cT8i9L9kbhx=b0sg4_qyndoedpw-1xypm9xzbqpmqr...@mail.gmail.com/ > > > fprintf(stderr, _("new worktree HEAD is now at %s"), > > find_unique_abbrev(commit->object.oid.hash, > > DEFAULT_ABBREV));
Re: [PATCH v2 2/2] builtin/config.c: prefer `--type=bool` over `--bool`, etc.
On Fri, Mar 30, 2018 at 01:28:30AM -0400, Taylor Blau wrote: > +static int option_parse_type(const struct option *opt, const char *arg, > + int unset) > +{ > + int *type = opt->value; > + > + if (!strcmp(arg, "bool")) > + *type = TYPE_BOOL; > + else if (!strcmp(arg, "int")) > + *type = TYPE_INT; > + else if (!strcmp(arg, "bool-or-int")) > + *type = TYPE_BOOL_OR_INT; > + else if (!strcmp(arg, "path")) > + *type = TYPE_PATH; > + else if (!strcmp(arg, "expiry-date")) > + *type = TYPE_EXPIRY_DATE; > + else { > + die(_("unexpected --type argument, %s"), arg); > + return 1; > + } > + return 0; > +} You need to handle "unset" here, which will trigger when somebody does: git config --no-type In which case you'd probably want to reset it to "0". > @@ -1622,4 +1623,21 @@ test_expect_success 'later legacy specifiers are given > precedence' ' > test_cmp expect actual > ' > > +test_expect_success '--type allows valid type specifiers' ' > + echo "true" >expect && > + git config --type=bool core.foo >actual && > + test_cmp expect actual > +' > + > +test_expect_success '--type rejects unknown specifiers' ' > + test_must_fail git config --type=nonsense core.foo 2>error && > + test_i18ngrep "unexpected --type argument" error > +' > + > +test_expect_success 'later legacy specifiers are given precedence' ' > + git config --bool --int core.number >actual && > + echo 10 > expect && > + test_cmp expect actual > +' I think there's some rebasing funkiness with this last test, which already exists from patch 1. Other than those two minor issues and the typos that René noticed, this looks good to me. -Peff
Re: [PATCH v2 1/2] builtin/config.c: treat type specifiers singularly
On Fri, Mar 30, 2018 at 01:28:29AM -0400, Taylor Blau wrote: > Internally, we represent `git config`'s type specifiers as a bitset > using OPT_BIT. 'bool' is 1<<0, 'int' is 1<<1, and so on. This technique > allows for the representation of multiple type specifiers in the `int > types` field, but this multi-representation is left unused. > > In fact, `git config` will not accept multiple type specifiers at a > time, as indicated by: > > $ git config --int --bool some.section > error: only one type at a time. > > This patch uses `OPT_SET_INT` to prefer the _last_ mentioned type > specifier, so that the above command would instead be valid, and a > synonym of: > > $ git config --bool some.section > > This change is motivated by two urges: (1) it does not make sense to > represent a singular type specifier internally as a bitset, only to > complain when there are multiple bits in the set. `OPT_SET_INT` is more > well-suited to this task than `OPT_BIT` is. (2) a future patch will > introduce `--type=`, and we would like not to complain in the > following situation: > > $ git config --int --type=int I think you could just end here, since... > Where--although the legacy and modern type specifier (`--int`, and > `--type`, respectively) do not conflict--we would store the value of > `--type=` in a `char *` and the `--int` as a bit in the `int` bitset. > > In the above, when error-checking `if (types != && type_str)`, we do not > check additionally whether or not these types are the same, and simply > complain immediately. This change makes `--int` and `--type=int` a true > synonym of each other, and removes the need for additional complexity, > as above in the conditional. None of this type_str stuff exists yet, and you've sufficiently motivated the change. > Signed-off-by: Taylor Blau> --- > builtin/config.c | 39 +-- > t/t1300-repo-config.sh | 11 +++ > 2 files changed, 28 insertions(+), 22 deletions(-) The patch mostly looks good. We probably want to also rewrite the TYPE_* #defines to be sequential, since somebody may scratch their head wondering why we use one bit per type when we do not treat them as a bitfield. > diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh > index 4f8e6f5fd..aa45b9267 100755 > --- a/t/t1300-repo-config.sh > +++ b/t/t1300-repo-config.sh > @@ -1611,4 +1611,15 @@ test_expect_success '--local requires a repo' ' > test_expect_code 128 nongit git config --local foo.bar > ' > > +cat >.git/config <<-\EOF && > +[core] > +number = 10 > +EOF > + > +test_expect_success 'later legacy specifiers are given precedence' ' > + git config --bool --int core.number >actual && > + echo 10 > expect && > + test_cmp expect actual > +' Minor style nit: we prefer ">expect" with no space. Though t1300 is certainly a cornucopia of style violations already. I could buy it under the guise of matching existing style if there wasn't a counter-example directly above. :) We also prefer to put that "cat >.git/config" inside a test block, though I'm pretty sure that _is_ following existing style in the test. -Peff
Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)
Hi Sergey, On Wed, 28 Mar 2018, Sergey Organov wrote: > Johannes Schindelinwrites: > > > I use rebase every day. I use the Git garden shears every week. If you > > do not trust my experience with these things, nothing will convince > > you. > > Unfortunately you have exactly zero experience with rebasing merges as > you've never actually rebased them till now, and it's rebasing merges > that matters in this particular discussion. Who says that I have 0 experience with that? Oh yes, you do. Like, as if you know. Guess what I do with those Git garden shears' merges? Can you guess? Of course you can. But you'll never know until I tell you. It is a little silly to try to tell me that I do not have any experience with rebasing merges when you have no idea what strategies I tried in the past. Now, Phillip's strategy is clearly the best strategy I ever heard about, and I am in the process of doing Actual Work to Put It To The Test. > > You are just stuck with your pre-existing opinion. > > I'm afraid that it's rather your huge experience with re-creating merges > that makes you stuck to your pre-existing opinion and carefully shields > you from experiencing actual paradigm shift. You know what? Whatevs. Ciao, Johannes
Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)
Hi Sergey, On Fri, 30 Mar 2018, Sergey Organov wrote: > Could we please agree to stop using backward compatibility as an > objection in the discussion of the --recreate-merges feature? No. The expectation of users as to what a `pick` is has not changed just because you wish it would. That is a matter of backwards-compatibility. You see, if you are driving a car for a hundred years already, and then switch to a different car, and it has a lever in the same place as your previous car's windshield wiper, but in the new car it has a button that activates the emergency driver seat ejection OMG *it has a seat ejection like in the James Bond movies! Where can I get that car?* Sorry for disgressing. I am really concerned about that willingness to put an innocuous button, so to speak, onto something users got really used to, over the course of a decade or so, when that button should really be made red and blinking and OMG where can I get that car? So to reiterate, I am really interested in a practical solution that won't cause nasty surprises. Meaning: `pick` != merge. That was a mistake in preserve-merges, as I have only mentioned like a hundred times, and we won't repeat it. Now back to that important question: where can I get such a James Bond car? Ideally also with Turbo Boost. Oh wait, that was somebody else's car. Ciao, Johannes
Re: [PATCH v4 01/13] commit-graph: add format document
Derrick Stoleewrites: > +== graph-*.graph files have the following format: What is this '*' here? [...] > + The remaining data in the body is described one chunk at a time, and > + these chunks may be given in any order. Chunks are required unless > + otherwise specified. Does Git need to understand all chunks, or could there be optional chunks that can be safely ignored (like in PNG format)? Though this may be overkill, and could be left for later revision of the format if deemed necessary. > + > +CHUNK DATA: > + > + OID Fanout (ID: {'O', 'I', 'D', 'F'}) (256 * 4 bytes) > + The ith entry, F[i], stores the number of OIDs with first > + byte at most i. Thus F[255] stores the total > + number of commits (N). All right, it is small enough that can be required even for a very small number of commits. > + > + OID Lookup (ID: {'O', 'I', 'D', 'L'}) (N * H bytes) > + The OIDs for all commits in the graph, sorted in ascending order. > + > + Commit Data (ID: {'C', 'G', 'E', 'T' }) (N * (H + 16) bytes) Do commits need to be put here in the ascending order of OIDs? If so, this would mean that it is not possible to add information about new commits by only appending data and maybe overwriting some fields, I think. You would need to do full rewrite to insert new commit in appropriate place. > +* The first H bytes are for the OID of the root tree. > +* The next 8 bytes are for the int-ids of the first two parents > + of the ith commit. Stores value 0x if no parent in that > + position. If there are more than two parents, the second value > + has its most-significant bit on and the other bits store an array > + position into the Large Edge List chunk. > +* The next 8 bytes store the generation number of the commit and > + the commit time in seconds since EPOCH. The generation number > + uses the higher 30 bits of the first 4 bytes, while the commit > + time uses the 32 bits of the second 4 bytes, along with the lowest > + 2 bits of the lowest byte, storing the 33rd and 34th bit of the > + commit time. > + > + Large Edge List (ID: {'E', 'D', 'G', 'E'}) [Optional] > + This list of 4-byte values store the second through nth parents for > + all octopus merges. The second parent value in the commit data stores > + an array position within this list along with the most-significant bit > + on. Starting at that array position, iterate through this list of > int-ids > + for the parents until reaching a value with the most-significant bit > on. > + The other bits correspond to the int-id of the last parent. All right, that is one chunk that cannot use fixed-length records; this shouldn't matter much, as we iterate only up to the number of parents less two. A question: what happens to the last list of parents? Is there a guardian value of 0x at last place? > + > +TRAILER: > + > + H-byte HASH-checksum of all of the above. > + Best, -- Jakub Narębski
Re: [PATCH 9/9] git_config_set: reuse empty sections
Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:19:09PM +0200, Johannes Schindelin wrote: > > > It can happen quite easily that the last setting in a config section is > > removed, and to avoid confusion when there are comments in the config > > about that section, we keep a lone section header, i.e. an empty > > section. > > > > The code to add new entries in the config tries to be cute by reusing > > the parsing code that is used to retrieve config settings, but that > > poses the problem that the latter use case does *not* care about empty > > sections, therefore even the former user case won't see them. > > > > Fix this by introducing a mode where the parser reports also empty > > sections (with a trailing '.' as tell-tale), and then using that when > > adding new config entries. > > Heh, so it seems we are partway to the "event-stream" suggestion I made > earlier. I agree this is the right way to approach this problem. > > I wondered if we allow keys to end in ".", but it seems that we don't. > > > diff --git a/config.c b/config.c > > index eb1e0d335fc..b04c40f76bc 100644 > > --- a/config.c > > +++ b/config.c > > @@ -653,13 +653,15 @@ static int get_base_var(struct strbuf *name) > > } > > } > > > > -static int git_parse_source(config_fn_t fn, void *data) > > +static int git_parse_source(config_fn_t fn, void *data, > > + int include_section_headers) > > We already have a "struct config_options", but we do a terrible job of > passing it around (since it only impacts the include stuff right now, > and that all gets handled at a very outer level). > > Rather than plumb this one int through everywhere, should we add it to > that struct and plumb the struct through? Yesss! Again, thank you so much for this really valuable review. This is even better than what I hoped for. Ciao, Dscho
Re: [PATCH 8/9] git_config_set: use do_config_from_file() directly
On Fri, Mar 30, 2018 at 03:02:00PM +0200, Johannes Schindelin wrote: > > I'm not sure I understand that last paragraph. What does flockfile() have > > to do with stdin/stdout? > > > > The point of those calls is that we're locking the FILE handle, so that > > it's safe for the lower-level config code to run getc_unlocked(), which > > is faster. > > > > So without those, we're calling getc_unlocked() without holding the > > lock. I think it probably works in practice because we know that we're > > single-threaded, but it seems a bit sketchy. > > Oops. I misunderstood the purpose of flockfile(), then. I thought it was > only about multiple users of stdin/stdout. > > Will have a look whether flockfile()/funlockfile() can be moved into > do_config_from_file() instead. In a sense stdin/stdout are much more susceptible to this because they're global variables, and any thread may touch them. For the config code, we open our own handle that we don't expose elsewhere. So probably it would be fine just to use the unlocked variants even without locking. But IMHO it's good practice to always flockfile() before using the unlocked variants. My reading of POSIX is that it's OK to use the unlocked variants without holding the lock (if you know there won't be contention), but if it's not hard to err on the side of safety, I'd prefer it. -Peff
Re: [PATCH 7/9] git config --unset: remove empty sections (in normal situations)
On Fri, Mar 30, 2018 at 03:00:06PM +0200, Johannes Schindelin wrote: > > I guess the holy grail would be a parser which reports _all_ syntactic > > events (section names, keys, comments, whitespace, etc) as a stream > > without storing anything. And then the normal reader could just discard > > the non-key events, and the writer here could build the tree from those > > events. > > I already changed the do_config_from_file()/do_config_from() code path to > allow for handing back section headers. And I *think* that approach should > be easily extended to allow for an optional callback for these syntactic > events (and we do not need more than that, as the parsed "tree" really is > a list: there is nothing nested about ini files, so we really only have a > linear list of blocks (event type, offset range)). True. I was thinking we'd want sections with keys, whitespace, and comments under them. But even that does not really make sense. As this patch series shows, comments do not "belong" to a section, and the file really needs to be considered as a stream. So yeah, if we can parse it into a sequence of events in one forward-pass and then manipulate that sequence, I think it should be sufficient (and _way_ more readable than the current code, even before the bits you are trying to fix here). > I'll think about this a little bit, and hopefully come back with v2 in a > while that uses that approach. > > Thank you so much for that suggestion, Great. Thanks for working on this. -Peff
Re: [PATCH 8/9] git_config_set: use do_config_from_file() directly
Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:19:04PM +0200, Johannes Schindelin wrote: > > > Technically, it is the git_config_set_multivar_in_file_gently() > > function that we modify here (but the oneline would get too long if we > > were that precise). > > > > This change prepares the git_config_set machinery to allow reusing empty > > sections, by using the file-local function do_config_from_file() > > directly (whose signature can then be changed without any effect outside > > of config.c). > > > > An incidental benefit is that we avoid a level of indirection, and we > > also avoid calling flockfile()/funlockfile() when we already know that > > we are not operating on stdin/stdout here. > > I'm not sure I understand that last paragraph. What does flockfile() have > to do with stdin/stdout? > > The point of those calls is that we're locking the FILE handle, so that > it's safe for the lower-level config code to run getc_unlocked(), which > is faster. > > So without those, we're calling getc_unlocked() without holding the > lock. I think it probably works in practice because we know that we're > single-threaded, but it seems a bit sketchy. Oops. I misunderstood the purpose of flockfile(), then. I thought it was only about multiple users of stdin/stdout. Will have a look whether flockfile()/funlockfile() can be moved into do_config_from_file() instead. Ciao, Dscho
Re: [PATCH 7/9] git config --unset: remove empty sections (in normal situations)
Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:19:00PM +0200, Johannes Schindelin wrote: > > > Let's generalize this observation to this conservative strategy: if we > > are removing the last entry from a section, and there are no comments > > inside that section nor surrounding it, then remove the entire section. > > Otherwise behave as before: leave the now-empty section (including those > > comments, even the one about the now-deleted entry). > > Yep, as I said earlier, this makes a ton of sense to me. > > [... thorough review ...] Thank you for taking the time (and figuring out my off-by-ones, am I not the king of those?). Your in-depth analysis of the backtracking approach also makes sense, in particular the awful bug that looks very, very similar to what 1/9 fixes elsewhere. I'll take some time to go over your comments in detail, but there is one suggestion that I think I'll want to pursue first: > I guess the holy grail would be a parser which reports _all_ syntactic > events (section names, keys, comments, whitespace, etc) as a stream > without storing anything. And then the normal reader could just discard > the non-key events, and the writer here could build the tree from those > events. I already changed the do_config_from_file()/do_config_from() code path to allow for handing back section headers. And I *think* that approach should be easily extended to allow for an optional callback for these syntactic events (and we do not need more than that, as the parsed "tree" really is a list: there is nothing nested about ini files, so we really only have a linear list of blocks (event type, offset range)). I'll think about this a little bit, and hopefully come back with v2 in a while that uses that approach. Thank you so much for that suggestion, Dscho
Re: [PATCH 4/9] t1300: remove unreasonable expectation from TODO
Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:18:50PM +0200, Johannes Schindelin wrote: > > > In https://public-inbox.org/git/7vvc8alzat@alter.siamese.dyndns.org/ > > a reasonable patch was made quite a bit less so by changing a test case > > demonstrating a bug to a test case that demonstrates that we ask for too > > much: the test case 'unsetting the last key in a section removes header' > > now expects a future bug fix to be able to determine whether a free-form > > comment above a section header refers to said section or not. > > > > Rather than shooting for the stars (and not even getting off the > > ground), let's start shooting for something obtainable and be reasonably > > confident that we *can* get it. > > As I said before, I'm fine with turning this test into something more > realistic. Good. Of course, I worked hard to come up with a patch series, i.e. I put in some effort to placate anybody who would be offended by my accompanying rant. > An obvious question is whether we should preserve the original > unrealistic parts by splitting it: the realistic parts into one > expect_failure (that we'd switch to expect_success by the end of this > series), and then an unrealistic one to serve as a documentation of the > ideal, with a comment explaining why it's unrealistic. As stated before, I think it would be a mistake to mark up this unrealistic example with `test_expect_failure`. We do, after all, suggest occasionally to grep for that when somebody asks what they could work on. And you do not want to set somebody like that up for failure by pointing them to such a "bug". However, I did keep the example to demonstrate the expectation that sections with surrounding comments are kept. That was very much intended. And the reason I did not change the unrealistic example? So that it is easier to review in our patch-based review process, where I try to avoid hunks that might distract from the intent of the change. Ciao, Dscho
Re: [PATCH 3/9] t1300: avoid relying on a bug
Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:18:45PM +0200, Johannes Schindelin wrote: > > > The test case 'unset with cont. lines' relied on a bug that is about to > > be fixed: it tests *explicitly* that removing the last entry from a > > config section leaves an *empty* section behind. > > > > Let's fix this test case not to rely on that behavior, simply by > > preventing the section from becoming empty. > > Seems like a good solution. I don't think we care in particular about > testing a multi-line value at the end of the file. ... and if we did, we should have documented that. Ciao, Dscho
Re: [PATCH 2/9] t1300: rename it to reflect that `repo-config` was deprecated
Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:18:40PM +0200, Johannes Schindelin wrote: > > > Signed-off-by: Johannes Schindelin> > --- > > t/{t1300-repo-config.sh => t1300-config.sh} | 0 > > 1 file changed, 0 insertions(+), 0 deletions(-) > > rename t/{t1300-repo-config.sh => t1300-config.sh} (100%) > > This has only been bugging me for oh, about 10 years. Yep. We should have done that right after moving the builtins' code to builtins/. Which reminds me that we *still* do not have a lib/ where all the source code for libgit.a lives. And then maybe standalone/ for the source code of the non-builtin tools. And... this would make for a fine micro-project next year, I guess. Or in ten. Ciao, Dscho
Re: [RFC] Rebasing merges: a jorney to the ultimate solution(RoadClear)
Hi Johannes, Johannes Schindelinwrites: > Hi, > > On Thu, 29 Mar 2018, Sergey Organov wrote: > >> Jacob Keller writes: >> >> > I care about the general compatibility of the rebase todo list >> > regardless of which options you enabled on the command line to >> > generate it. >> >> It's a good thing in general, yes. However, I recall I was told by the >> author that --recreate-merges was introduced exactly to break backward >> compatibility of the todo list. If so, could we please agree to stop >> using backward compatibility as an objection in the discussion of this >> particular feature? > > That is a serious misrepresentation of what I said. > > If I had changed --preserve-merges to the new format, *that* would have > broken backwards-compatibility. > > So the entire reason of introducing --recreate-merges was to *not have to > break backwards-compatibility*. > > I definitely did not say the *exact opposite*. I'm sorry I committed ambiguity in my wording that allowed it to be misinterpreted. I actually intended to say roughly the same thing you are saying, as what matters for the discussion is that new todo list format does not need to be (backward-)compatible to that of --preserve-merges. > Hopefully this clarifies your confusion, There was actually no confusion on my side, and I like your wording better. Except that you've managed to clarify your intentions without actually addressing the primary concern: Could we please agree to stop using backward compatibility as an objection in the discussion of the --recreate-merges feature? Could we? I understand you are still resistant to change 'pick' syntax, but it's not because of backward-compatibility, right? -- Sergey
Re: [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug)
Hi Peff, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 05:18:30PM +0200, Johannes Schindelin wrote: > > > The first patch is somewhat of a "while at it" bug fix that I first > > thought would be a lot more critical than it actually is: It really > > only affects config files that start with a section followed > > immediately (i.e. without a newline) by a one-letter boolean setting > > (i.e. without a `= ` part). So while it is a real bug fix, I > > doubt anybody ever got bitten by it. > > That makes me wonder if somebody could craft a malicious config to do > something bad. I thought about that, and could not think of anything other than social engineering vectors. Even in that case, the error message is instructive enough that the user should be able to fix the config without consulting StackOverflow. > > Now, to the really important part: why does this patch series not > > conflict with my very early statements that we cannot simply remove > > empty sections because we may end up with stale comments? > > > > Well, the patch in question takes pains to determine *iff* there are > > any comments surrounding, or included in, the section. If any are > > found: previous behavior. Under the assumption that the user edited > > the file, we keep it as intact as possible (see below for some > > argument against this). If no comments are found, and let's face it, > > this is probably *the* common case, as few people edit their config > > files by hand these days (neither should they because it is too easy > > to end up with an unparseable one), the now-empty section *is* > > removed. > > I'm not against people editing their config files by hand. But I think > what you propose here makes a lot of sense, because it works as long as > you don't intermingle hand- and auto-editing in the same section (and it > even works if you do intermingle, as long as you don't use comments, > which are probably even more rare). > > So it seems like quite a sensible compromise, and I think should make > most people happy. Thanks for confirming my line of thinking, Dscho
Re: [PATCH 1/9] git_config_set: fix off-by-two
Hi, On Thu, 29 Mar 2018, Jeff King wrote: > On Thu, Mar 29, 2018 at 11:15:33AM -0700, Stefan Beller wrote: > > > > When calling `git config --unset abc.a` on this file, it leaves this > > > (invalid) config behind: > > > > > > [ > > > [xyz] > > > key = value > > > > > > The reason is that we try to search for the beginning of the line (or > > > for the end of the preceding section header on the same line) that > > > defines abc.a, but as an optimization, we subtract 2 from the offset > > > pointing just after the definition before we call > > > find_beginning_of_line(). That function, however, *also* performs that > > > optimization and promptly fails to find the section header correctly. > > > > This commit message would be more convincing if we had it in test form. > > I agree a test might be nice. But I don't find the commit message > unconvincing at all. It explains pretty clearly why the bug occurs, and > you can verify it by looking at find_beginning_of_line. > > > [abc]a > > > > is not written by Git, but would be written from an outside tool or person > > and we barely cope with it? > > Yes, I don't think git would ever write onto the same line. But clearly > we should handle anything that's syntactically valid. I was tempted to add the test case, because it is easy to test it. But I then decided *not* to add it. Why? Testing is a balance between "can do" and "need to do". Can you imagine that I did *not* run the entire test suite before submitting this patch series, because it takes an incredible *90 minutes* to run *on a fast Windows machine*? Seriously, this is hurting me. I do not complain about this due to some mental illness forcing me to do it. I complain about this so often *because it slows me down*, you gentle people. And you don't seem to care, at least the test suite gets noticably worse by the month. I frankly do not know what to do about this, as you keep adding and adding and it gets less and less feasible for me to run the full test suite. I seem to be totally unable to get through to you with the message that this is a real problem with a real need to get fixed. So with this in mind, I do not want to add a test case for a concocted example that won't affect anybody except users who *want* to trigger this bug. I hope you agree, Dscho P.S.: Of course I ran the entire test suite. Not on Windows, but in a Linux VM, because Linux is what Git is fine-tuned for, most obviously so. An alien digging up ancient Earth history in the far future might be tempted to assume that Git was developed to develop Linux which was developed to develop Git, and then ask herself why humans bothered at all. I actually ran the entire test suite on Linux on every single patch, via `git rebase -x "make -j15 DEVELOPER=1 test" @{u}`, as I usually do before submitting a patch series. And it *did* find an obscure bug in an earlier iteration, where t5512-ls-remote.sh demonstrated that looking at only one entry at a time is not enough: `git config --unset-all uploadpack.hiderefs` *also* needs to remove the now-empty section, because we might end up with the empty sections in the wrong order, and the order of [transfer] and [uploadpack] *matters* if the transfer.hiderefs setting is negative and the uploadpack.hiderefs setting is positive, as is the case in 'overrides work between mixed transfer/upload-pack hideRefs'. (Side-note: this looks like a pretty obvious design bug to me, as there is *no tooling* to switch around the order of these settings. Even worse: if somebody gets instructions to add those settings, and there is already a [transfer] section in the config: you're out of luck! You will have to *know* that the order matters, *and add a second [transfer] section manually*!)
Re: [PATCH 0/9] Assorted fixes for `git config` (including the "empty sections" bug)
Hi Stefan, On Thu, 29 Mar 2018, Stefan Beller wrote: > On Thu, Mar 29, 2018 at 8:18 AM, Johannes Schindelin >wrote: > > > So what is the argument against this extra care to detect comments? Well, if > > you have something like this: > > > > [section] > > ; Here we comment about the variable called snarf > > snarf = froop > > > > and we run `git config --unset section.snarf`, we end up with this config: > > > > [section] > > ; Here we comment about the variable called snarf > > > > which obviously does not make sense. However, that is already established > > behavior for quite a few years, and I do not even try to think of a way how > > this could be solved. > > By commenting out the key/value pair instead of deleting it. > It's called --unset, not --delete ;) That would open the door to new bug reports when a user starts with this concocted config: [section] # This is a comment about the `key` setting key = value and then does this: git config --unset section.key git config section.key value git config --unset section.key git config section.key value git config --unset section.key git config section.key value and then ends up with a config like this: [section] # This is a comment about the `key` setting ;key = value ;key = value ;key = value key = value And note that the comment might be about `value` instead, so reusing a commented-out `key` setting won't fly, either. I *did* give this problem a couple of minutes of thought before writing my assessment that is quoted above ;-) Ciao, Dscho
Re: [PATCH v4 00/13] Serialized Git Commit Graph
I hope that I am addressing the most recent version of this series. Derrick Stoleewrites: > As promised [1], this patch contains a way to serialize the commit graph. > The current implementation defines a new file format to store the graph > structure (parent relationships) and basic commit metadata (commit date, > root tree OID) in order to prevent parsing raw commits while performing > basic graph walks. For example, we do not need to parse the full commit > when performing these walks: > > * 'git log --topo-order -1000' walks all reachable commits to avoid > incorrect topological orders, but only needs the commit message for > the top 1000 commits. > > * 'git merge-base ' may walk many commits to find the correct > boundary between the commits reachable from A and those reachable > from B. No commit messages are needed. > > * 'git branch -vv' checks ahead/behind status for all local branches > compared to their upstream remote branches. This is essentially as > hard as computing merge bases for each. > > The current patch speeds up these calculations by injecting a check in > parse_commit_gently() to check if there is a graph file and using that > to provide the required metadata to the struct commit. That's nice. What are the assumptions about the serialized commit graph format? Does it need to be: - extensible without rewriting (e.g. append-only)? - like the above, but may need rewriting for optimal performance? - extending it needs to rewrite whole file? Excuse me if it waas already asked and answered. > > The file format has room to store generation numbers, which will be > provided as a patch after this framework is merged. Generation numbers > are referenced by the design document but not implemented in order to > make the current patch focus on the graph construction process. Once > that is stable, it will be easier to add generation numbers and make > graph walks aware of generation numbers one-by-one. As the serialized commit graph format is versioned, I wonder if it would be possible to speed up graph walks even more by adding to it FELINE index (pair of numbers) from "Reachability Queries in Very Large Graphs: A Fast Refined Olnine Search Approach" (2014) - available at http://openproceedings.org/EDBT/2014/paper_166.pdf The implementation would probably need adjustments to make it unambiguous and unambiguously extensible; unless there is place for indices that are local-only and need to be recalculated from scratch when graph changes (to cover all graph). > > Here are some performance results for a copy of the Linux repository > where 'master' has 704,766 reachable commits and is behind 'origin/master' > by 19,610 commits. > > | Command | Before | After | Rel % | > |--|||---| > | log --oneline --topo-order -1000 | 5.9s | 0.7s | -88% | > | branch -vv | 0.42s | 0.27s | -35% | > | rev-list --all | 6.4s | 1.0s | -84% | > | rev-list --all --objects | 32.6s | 27.6s | -15% | That's the "Rel %" of "Before", that is delta/before, isn't it? > To test this yourself, run the following on your repo: > > git config core.commitGraph true > git show-ref -s | git commit-graph write --set-latest --stdin-commits > > The second command writes a commit graph file containing every commit > reachable from your refs. Now, all git commands that walk commits will > check your graph first before consulting the ODB. You can run your own > performance comparisions by toggling the 'core.commitgraph' setting. Good. It is nicely similar to how bitmap indices (of reachability) are handled. I just wonder what happens in the (rare) presence of grafts (old mechanism), or "git replace"-d commits... Regards, -- Jakub Narębski
Re: [PATCH v3 0/3] add -p: select individual hunk lines
On 29/03/18 19:32, Junio C Hamano wrote: > Phillip Woodwrites: > >> From: Phillip Wood >> >> Since v2 I've updated the patches to use '-' instead of '^' to invert >> the selection to match the rest of add -i and clean -i. >> >> These patches build on top of the recount fixes in [1]. The commit >> message for the first patch describes the motivation: >> >> "When I end up editing hunks it is almost always because I want to >> stage a subset of the lines in the hunk. Doing this by editing the >> hunk is inconvenient and error prone (especially so if the patch is >> going to be reversed before being applied). Instead offer an option >> for add -p to stage individual lines. When the user presses 'l' the >> hunk is redrawn with labels by the insertions and deletions and they >> are prompted to enter a list of the lines they wish to stage. Ranges >> of lines may be specified using 'a-b' where either 'a' or 'b' may be >> omitted to mean all lines from 'a' to the end of the hunk or all lines >> from 1 upto and including 'b'." > > I haven't seen any review comments on this round, and as I am not a > heavy user of "add -i" interface (even though I admit that I > originally wrote it), I haven't had a chance to exercise the code > myself in the two weeks since the patches have been queued in my > tree. > > I am inclihned to merge them to 'next' soonish, but please stop me > if anybody (including the original author) has further comments. > > Thanks. > Hi Junio, if no one else has any comments, then I think it's ready for next. I've not used this latest incarnation much but I've used the previous versions quite a bit. Best Wishes Phillip