Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread Michael Haggerty
On 06/23/2015 02:50 AM, David Turner wrote:
 I've revived and modified Ronnie Sahlberg's work on the refs db
 backend.  
 
 The work is on top of be3c13e5564, Junio's First batch for 2.5 cycle.
 I recognize that there have been changes to the refs code since then,
 and that there are some further changes in-flight from e.g. Michael
 Haggerty.  If there is interest in this, I can rebase once Michael's
 changes land.

It's awesome that you are working on this!

I'm reading through your commits and will add comments as they pop into
my head...

* I initially read refs-be-files to be a short version of references,
they be files. I might never be able to get that pronunciation out of
my head :-)

* It would be more modest to call the files implementing the LMDB
backend refs-be-lmdb.[c,h] rather than refs-be-db.[c,h].

* I wonder whether `refname_is_safe()` might eventually have to become
backend-specific. For example, maybe one backend will have to impose a
limit of 128 characters or something? No matter, though...it can be
moved later.

* You have put `format_reflog_msg()` in the public interface. It
probably makes sense, because more than one backend might want to use
it. But another backend might want to store (refname, old_sha1,
new_sha1, ...) as separate columns in a database. As long as
`format_reflog_msg()` is seen as a helper function and is not called by
any of the shared code, it shouldn't be a problem.

* I wonder whether `init_backend()` will be general enough. I'm thinking
by analogy with object constructors, which usually need class-specific
arguments during their initialization even though afterwards objects of
different classes can be used interchangeably. So I guess the idea is
that a typical `init_backend()` function will have to dig around itself
to find whatever additional information that it needs (e.g., from the
git configuration or the filesystem or whatever). So I think this is OK.

* Your methods for bulk updates are I think analogous to the
`initial_ref_transaction_commit()` function that I recently submitted
[1]. Either way, the goal is to abstract away the fact that the
file-based backend uses packed and loose references with tradeoffs that
callers currently have to know about.

* I don't like the fact that you have replaced `struct ref_transaction
*` with `void *` in the public interface. On a practical level, I like
the bit of type-safety that comes with the more specific declaration.
But on a more abstract level, I think that the concept of a transaction
could be useful across backends, for example in utility functions that
verify that a proposed set of updates are internally consistent. I would
rather see either

  * backends extend a basic `struct ref_transaction` to suit their
needs, and upcast/downcast pointers at the module boundary, or

  * `struct ref_transaction` itself gets a `void *` member that backends
can use for whatever purposes they want.

* Regarding MERGE_HEAD: you take the point of view that it must continue
to be stored as a file. And yet it must also behave somewhat like a
reference; for example, `git rev-parse MERGE_HEAD` works today.
MERGE_HEAD is also used for reachability, right?

Another point of view is that MERGE_HEAD is a plain old boring
reference, but there is some other metadata related to it that the refs
backend has to store. The file-based backend would have special-case
code to read the additional data from the tail of the loose refs file
(and be sure to write the metadata when writing the reference), but
other backends could store the reference with the rest but do their own
thing with the metadata. So I guess I'm wondering whether the refs API
needs a MERGE_HEAD-specific way to read and write MERGE_HEAD along with
its metadata.

* Don't the same considerations that apply to MERGE_HEAD also apply to
FETCH_HEAD?

* I'm showing my ignorance of LMDB, but I don't see where the LMDB
backend initializes its database during `git init-db`. Is that what
`init_env()` does? But it looks like `init_env()` is called on every git
invocation (via `git_config_early()`). Puzzled.

* Meanwhile, `create_default_files()` (in `builtin/init-db`) still
creates directories `refs`, `refs/heads`, and `refs/tags`.

* Rehash of the last two points: I expected one backend function that is
used to initialize the refs backend when a new repository is created
(e.g., in `git init`). The file-based backend would use this function to
create the `refs`, `refs/heads`, and `refs/tags` directories. I expected
a second function that is called once every time git runs in an existing
repository (this one might, for example, open a database connection).
And maybe even a third one that closes down the database connection
before git exits. Would you please explain how this actually works?

* `lmdb_init_backend()` leaks `path` if `env` is already set (in which
case it needn't compute `path` in the first place).

* You have the constraint that submodules need to use the same reference

Untracked files when git status executed on a new folder

2015-06-23 Thread Víctor Martín Hernández
Hi all.
Today I've had an unexpected behaviour that I'm not sure if is a bug or
I'm not doing git best practices... (surely the latest...)
The sequence of actions is :

1. create a new subfolder of my local repository branch
2. cd to this new folder, and create a new file
3. execute git status from the new folder

Doing that, the new folder doesn't appear as untracked.

4. cd ..
5. git status
In this case, the new folder appears.

If I create a new folder on the same level that the new one created in
step 1, cd into it, and execute git status, the folder created in step 1
appears as untracked.

After step 3 I have executed git commit and git push to the remote, and
later another user has realized that the new file was not present.

Please let me know if you understand my explanation, and if it's a bug
or just that I'm not as experienced as I'd like with git ;)

THANKS!!
Víctor

-- 
---
Víctor Martín Hernández

RD Software Engineer
Instituto de Ciencias del Espacio (ICE/CSIC), and 
Institut d'Estudis Espacials de Catalunya (IEEC)

Campus UAB, Carrer de Can Magrans, s/n
08193  Bellaterra (Cerdanyola del Vallès) - Barcelona
Tel. : +34 93 586 8782
Web:   http://gwart.ice.cat/

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] apply: fix adding new files on i-t-a entries

2015-06-23 Thread Nguyễn Thái Ngọc Duy
Since d95d728 (diff-lib.c: adjust position of i-t-a entries in diff -
2015-03-16), a normal git diff on i-t-a entries would produce a diff
that _adds_ those files, not just adding content to existing and empty
files like before.

This is correct. Unfortunately, applying such a patch on the same
repository that has the same i-t-a entries will fail with message
already exists in index because git-apply checks, sees those i-t-a
entries and aborts. git-apply does not realize those are for
bookkeeping only, they do not really exist in the index.

This patch tightens the exists in index check, ignoring i-t-a
entries. For fixing the above problem, only the change in
check_to_create() is needed. For other changes,

 - load_current(), reporting not exists in index is better than does
   not match index

 - check_preimage(), similar to load_current(), but it may also use
   ce_mode from i-t-a entry which is always zero

 - get_current_sha1(), or actually build_fake_ancestor(), we should not
   add i-t-a entries to the temporary index, at least not without also
   adding i-t-a flag back

Since git add -p and git am use git apply underneath, they are
broken too before this patch and fixed now.

Reported-by: Patrick Higgins phigg...@google.com
Reported-by: Bjørnar Snoksrud snoks...@gmail.com
Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com
---
 I think I'm opening a can of worms with d95d728. There's nothing
 wrong with that patch per se, but with this issue popping up, I need
 to go over all {cache,index}_name_pos call sites and see what would be
 the sensible behavior when i-t-a entries are involved.
 
 So far blame, rm and checkout-entry and checkout paths are on my
 to-think-or-fix list. But this patch can get in early to fix a real
 regression instead of waiting for one big series. A lot more
 discussions will be had before that series gets in good shape.

 builtin/apply.c   |  8 
 cache.h   |  2 ++
 read-cache.c  | 12 
 t/t2203-add-intent.sh | 10 ++
 4 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/builtin/apply.c b/builtin/apply.c
index 146be97..4f813ac 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -3344,7 +3344,7 @@ static int load_current(struct image *image, struct patch 
*patch)
if (!patch-is_new)
die(BUG: patch to %s is not a creation, patch-old_name);
 
-   pos = cache_name_pos(name, strlen(name));
+   pos = exists_in_cache(name, strlen(name));
if (pos  0)
return error(_(%s: does not exist in index), name);
ce = active_cache[pos];
@@ -3497,7 +3497,7 @@ static int check_preimage(struct patch *patch, struct 
cache_entry **ce, struct s
}
 
if (check_index  !previous) {
-   int pos = cache_name_pos(old_name, strlen(old_name));
+   int pos = exists_in_cache(old_name, strlen(old_name));
if (pos  0) {
if (patch-is_new  0)
goto is_new;
@@ -3551,7 +3551,7 @@ static int check_to_create(const char *new_name, int 
ok_if_exists)
struct stat nst;
 
if (check_index 
-   cache_name_pos(new_name, strlen(new_name)) = 0 
+   exists_in_cache(new_name, strlen(new_name)) = 0 
!ok_if_exists)
return EXISTS_IN_INDEX;
if (cached)
@@ -3824,7 +3824,7 @@ static int get_current_sha1(const char *path, unsigned 
char *sha1)
 
if (read_cache()  0)
return -1;
-   pos = cache_name_pos(path, strlen(path));
+   pos = exists_in_cache(path, strlen(path));
if (pos  0)
return -1;
hashcpy(sha1, active_cache[pos]-sha1);
diff --git a/cache.h b/cache.h
index 571c98f..6a44cb6 100644
--- a/cache.h
+++ b/cache.h
@@ -341,6 +341,7 @@ extern void free_name_hash(struct index_state *istate);
 #define discard_cache() discard_index(the_index)
 #define unmerged_cache() unmerged_index(the_index)
 #define cache_name_pos(name, namelen) 
index_name_pos(the_index,(name),(namelen))
+#define exists_in_cache(name, namelen) 
exists_in_index(the_index,(name),(namelen))
 #define add_cache_entry(ce, option) add_index_entry(the_index, (ce), (option))
 #define rename_cache_entry_at(pos, new_name) rename_index_entry_at(the_index, 
(pos), (new_name))
 #define remove_cache_entry_at(pos) remove_index_entry_at(the_index, (pos))
@@ -512,6 +513,7 @@ extern int verify_path(const char *path);
 extern struct cache_entry *index_dir_exists(struct index_state *istate, const 
char *name, int namelen);
 extern struct cache_entry *index_file_exists(struct index_state *istate, const 
char *name, int namelen, int igncase);
 extern int index_name_pos(const struct index_state *, const char *name, int 
namelen);
+extern int exists_in_index(const struct index_state *, const char *name, int 
namelen);
 #define ADD_CACHE_OK_TO_ADD 1  /* Ok to add */
 #define ADD_CACHE_OK_TO_REPLACE 2  /* Ok to replace 

Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread Duy Nguyen
On Tue, Jun 23, 2015 at 6:47 PM, Jeff King p...@peff.net wrote:
 I was thinking of actually moving to a log-structured ref storage.
 Something like:

   - any ref write puts a line at the end of a single logfile that
 contains the ref name, along with the normal reflog data

   - the logfile is the source of truth for the ref state. If you want to
 know the value of any ref, you can read it backwards to find the
 last entry for the ref. Everything else is an optimization.

 Let's call the number of refs N, and the number of ref updates in
 the log U.

   - we keep a key/value index mapping the name of any branch that exists
 to the byte offset of its entry in the logfile. This would probably

One key/value mapping per branch, pointing to the latest reflog entry,
or one key/valye mapping for each reflog entry?

 be in some binary key/value store (like LMDB). Without this,
 resolving a ref is O(U), which is horrible. With it, it should be
 O(1) or O(lg N), depending on the index data structure.

I'm thinking of the user with small or medium repos, in terms of refs,
who does not want an extra dependency. If we store one mapping per
branch, then the size of this mapping is small enough that the index
in a text file is ok. If we also store the offset to the previous
reflog entry of the same branch in the current reflog entry, like a
back pointer, then we could jump back faster.

Or do you have something else in mind? Current reflog structure won't
work because I think you bring back the reflog graveyard with this,
and I don't want to lose that

   - the index can also contain other optimizations. E.g., rather than
 point to the entry in the logfile, it can include the sha1 directly
 (to avoid an extra level of indirection). It may want to include the
 peeled value, as the current packed-refs file does.

   - Reading all of the reflogs (e.g., for repacking) is O(U), just like
 it is today. Except the storage for the logfile is a lot more
 compact than what we store today, with one reflog per file.

   - Reading a single reflog is _also_ O(U), which is not as good as
 today. But if each log entry contains a byte offset of the previous
 entry, you can follow the chain (it is still slightly worse, because
 you are jumping all over the file, rather than reading a compact set
 of lines).

   - Pruning the reflog entries from the logfile requires rewriting the
 whole thing. That's similar to today, where we rewrite each of the
 reflog files.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Jun 2015, #05; Mon, 22)

2015-06-23 Thread Johannes Schindelin
Hi Junio,

On 2015-06-23 00:49, Junio C Hamano wrote:

 * js/rebase-i-clean-up-upon-continue-to-skip (2015-06-18) 3 commits
  - rebase -i: do not leave a CHERRY_PICK_HEAD file behind
  - SQUASH: test_must_fail is a shell function
  - t3404: demonstrate CHERRY_PICK_HEAD bug
 
  Abandoning an already applied change in git rebase -i with
  --continue left CHERRY_PICK_HEAD and confused later steps.
 
  Expecting a reroll.
  ($gmane/271856)

Actually, there had been two re-rolls, and v3 seemed to be okay: $gmane/272037. 
It also looks as if 726a35ebd^..726a35ebd^2 is identical with v3... Anything 
you want me to change in addition?

Also:

 * js/fsck-opt (2015-06-22) 19 commits
  - fsck: support ignoring objects in `git fsck` via fsck.skiplist
  - fsck: git receive-pack: support excluding objects from fsck'ing
  - fsck: introduce `git fsck --connectivity-only`
  - fsck: support demoting errors to warnings
  - fsck: document the new receive.fsck.msg-id options
  - fsck: allow upgrading fsck warnings to errors
  - fsck: optionally ignore specific fsck issues completely
  - fsck: disallow demoting grave fsck errors to warnings
  - fsck: add a simple test for receive.fsck.msg-id
  - fsck: make fsck_tag() warn-friendly
  - fsck: handle multiple authors in commits specially
  - fsck: make fsck_commit() warn-friendly
  - fsck: make fsck_ident() warn-friendly
  - fsck: report the ID of the error/warning
  - fsck (receive-pack): allow demoting errors to warnings
  - fsck: offer a function to demote fsck errors to warnings
  - fsck: provide a function to parse fsck message IDs
  - fsck: introduce identifiers for fsck messages
  - fsck: introduce fsck options
 
  Rerolled (at v7) and seems more or less ready for 'next'.

I see that you used `downcased` instead of my `lowercased`, which makes more 
sense, but the style of the multi-line `for` loop as per `pu` is still as *I* 
wrote it... I also saw that you downcased the first letter after `fsck:` in the 
commit messages, and touched up the message of the report the ID of the 
error/warning commit. Do you want to touch up the `for` loop style in offer a 
function to demote fsck errors to warnings or shall I send a v8 (it is ready 
to go: https://github.com/dscho/git/compare/next...fsck-api)?

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 12/19] initial_ref_transaction_commit(): check for duplicate refs

2015-06-23 Thread Junio C Hamano
Michael Haggerty mhag...@alum.mit.edu writes:

 On 06/22/2015 11:06 PM, Junio C Hamano wrote:
 ...
 What I am wondering is if we could turn the safety logic that appear
 here (i.e. no existing refs must be assumed by the set of updates,
 etc.)  into an optimization cue and implement this as a special case
 helper to ref_transaction_commit(), i.e.
 
  ref_transaction_commit(...)
 {
  if (updates are all initial creation 
 no existing refs in repository)
  return initial_ref_transaction_commit(...);
  /* otherwise we do the usual thing */
  ...
  }
 
 and have clone call ref_transaction_commit() as usual.

 The safety logic in this function is (approximately) necessary, but not
 sufficient, to guarantee safety.

Oh, no question about it, and you do not even have to bring up an
insane user runs random commands while Git is hard working on it
non use-case ;-)

 One of the shortcuts that it takes is
 not locking the references while they are being created. Therefore, it
 would be unsafe for one process to call ref_transaction_commit() while
 another is calling initial_ref_transaction_commit(). So the caller has
 to know somehow that no other processes are working in the repository
 for this optimization to be safe. It conveys that knowledge by calling
 initial_ref_transaction_commit() rather than ref_transaction_commit().

OK.  So the answer to my first question is the initial creation
logic too fragile is a resounding yes; the caller should know
that it is too crazy for the user to be competing with what it is
doing before deciding to call initial_ref_transaction_commit(),
hence we cannot automatically detect if it is safe from within
ref_transaction_commit() to use this logic as an optimization.

 But I think if anything it would make more sense to go the other direction:

 * Teach ref_transaction_commit() an option that asks it to write
   references updates to packed-refs instead of loose refs (but
   locking the references as usual).

 * Change clone to use ref_transaction_commit() like everybody
   else, passing it the new REFS_WRITE_TO_PACKED_REFS option.

 Then clone would participate in the normal locking protocol, and it
 wouldn't *matter* if another process runs before the clone is finished.

Yeah, I thought that was actually I was driving at, and doing so
without that write-to-packed-refs option, which I'd prefer to leave
it as an optimization inside ref_transaction_commit().

Except that I missed that the initial_* variant is even more
aggressive (i.e. not locking), so no such optimization is safe.

 There would also be some consistency benefits. For example, if
 core.logallrefupdates is set globally or on the command line, the
 initial reference creations would be reflogged. And other operations
 that write references in bulk could use the new
 REFS_WRITE_TO_PACKED_REFS option to prevent loose reference proliferation.

 But I don't think any of this is a problem in practice, and I think we
 can live with using the optimized-but-not-100%-safe
 initial_ref_transaction_commit() for cloning.

OK.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] apply: fix adding new files on i-t-a entries

2015-06-23 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 Since d95d728 (diff-lib.c: adjust position of i-t-a entries in diff -
 2015-03-16), a normal git diff on i-t-a entries would produce a diff
 that _adds_ those files, not just adding content to existing and empty
 files like before.

 This is correct. Unfortunately, applying such a patch on the same
 repository that has the same i-t-a entries will fail with message
 already exists in index because git-apply checks, sees those i-t-a
 entries and aborts. git-apply does not realize those are for
 bookkeeping only, they do not really exist in the index.

 This patch tightens the exists in index check, ignoring i-t-a
 entries. For fixing the above problem, only the change in
 check_to_create() is needed.

And the first thing I noticed and found a bit disturbing was that
this change (which I think is correct, and happens to match what I
sent out earlier) was the only thing necessary to make your new test
pass.  IOW, the other changes in this patch have no test coverage.

 For other changes,

  - load_current(), reporting not exists in index is better than does
not match index

Is that error reporting the only side effect from this change?

This is only used when falling back to three-way merge while
applying a creation patch.

  - check_preimage(), similar to load_current(), but it may also use
ce_mode from i-t-a entry which is always zero

This is for the normal (non three-way) application and the idea is
the same as load_current() as you said above.

  - get_current_sha1(), or actually build_fake_ancestor(), we should not
add i-t-a entries to the temporary index, at least not without also
adding i-t-a flag back

This is part of am three-way fallback codepath.  I do not think
the merge-recursive three-way merge code knows and cares about, is
capable of handling, or would even want to deal with i-t-a entries
in the first place, so adding an entry as i-t-a bit would not help.
What the ultimate caller wants from us in this codepath is a tree
object, and that is written out from the temporary index---and that
codepath ignores i-t-a entries, so it is correct to omit them from
the temporary index in the first place.  Unlike the previous two
changes, I think this change deserves a new test.

  I think I'm opening a can of worms with d95d728. There's nothing
  wrong with that patch per se, but with this issue popping up, I need
  to go over all {cache,index}_name_pos call sites and see what would be
  the sensible behavior when i-t-a entries are involved.

Yeah, I agree that d95d728 should have been a part of a larger
series that changes the world order, instead of a single change that
brings inconsistency to the system.

I cannot offhand convince myself that apply is the only casualty;
assuming it is, I think a reasonable way forward is to keep d95d728
and adjust apply to the new world order.  Otherwise, i.e. if there
are wider fallouts from d95d728, we may instead want to temporarily
revert it off from 'master', deal with fallouts to apply and other
things, before resurrecting it.

Anything that internally uses diff-index is suspect, I think.

What do others think?  You seem to ...

  So far blame, rm and checkout-entry and checkout paths are on my
  to-think-or-fix list. But this patch can get in early to fix a real
  regression instead of waiting for one big series. A lot more
  discussions will be had before that series gets in good shape.

... think that the damage could be quite extensive, so I am inclined
to say that we first revert d95d728 before going forward.

  builtin/apply.c   |  8 
  cache.h   |  2 ++
  read-cache.c  | 12 
  t/t2203-add-intent.sh | 10 ++
  4 files changed, 28 insertions(+), 4 deletions(-)

 diff --git a/builtin/apply.c b/builtin/apply.c
 index 146be97..4f813ac 100644
 --- a/builtin/apply.c
 +++ b/builtin/apply.c
 @@ -3344,7 +3344,7 @@ static int load_current(struct image *image, struct 
 patch *patch)
   if (!patch-is_new)
   die(BUG: patch to %s is not a creation, patch-old_name);
  
 - pos = cache_name_pos(name, strlen(name));
 + pos = exists_in_cache(name, strlen(name));

Something that is named as if it would return yes/no that returns a
real value is not a very welcome idea.

 +/* This is the same as index_name_pos, except that i-t-a entries are 
 invisible */
 +int exists_in_index(const struct index_state *istate, const char *name, int 
 namelen)
 +{
 + int pos = index_name_stage_pos(istate, name, namelen, 0);
 +
 + if (pos  0)
 + return pos;
 + if (istate-cache[pos]-ce_flags  CE_INTENT_TO_ADD)
 + return -pos-1;

This is a useless complexity.  Your callers cannot use the returned
value like this:

pos = exists_in_cache(...);
if (pos  0) {
if (active_cache[-pos-1]-ce_flags  CE_INTENT_TO_ADD)
; /* ah it actually exists but it is i-t-a */

Re: Dependency Management

2015-06-23 Thread Stefan Beller
On Tue, Jun 23, 2015 at 1:52 AM, Jean Audibert jaudib...@euronext.com wrote:
 Hi,

 Sorry to bother you with this question but I can't find any official answer 
 or strong opinion from Git community.

 In my company we recently started to use Git and we wonder how to share code 
 and manage dependencies with Git?
 Use case: in project P we need to include lib-a and lib-b (libraries shared 
 by several projects)

 In your opinion, what is the future proof solution?
 * Use submodule
 * Use subtree

 We know there is lot of PRO/CONS but I feel that subtree is behind in the 
 race and the latest version of submodule work fine

Use whatever works fine for your use case.

My personal opinion/expectation is to see submodules
improving/advancing more than subtrees advancing in the near future.
Though this is neither the official nor a strong opinion.

Stefan


 Suggestions are very welcome.
 Thanks in advance,

 Jean Audibert


 _

 This message may contain confidential information and is intended for 
 specific recipients unless explicitly noted otherwise. If you have reason to 
 believe you are not an intended recipient of this message, please delete it 
 and notify the sender. This message may not represent the opinion of Euronext 
 N.V. or any of its subsidiaries or affiliates, and does not constitute a 
 contract or guarantee. Unencrypted electronic mail is not secure and the 
 recipient of this message is expected to provide safeguards from viruses and 
 pursue alternate means of communication where privacy or a binding message is 
 desired.

 --
 To unsubscribe from this list: send the line unsubscribe git in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread David Turner
On Mon, 2015-06-22 at 22:36 -0700, Junio C Hamano wrote:
 David Turner dtur...@twopensource.com writes:
 
  I've revived and modified Ronnie Sahlberg's work on the refs db
  backend.  
 
  The work is on top of be3c13e5564, Junio's First batch for 2.5 cycle.
  I recognize that there have been changes to the refs code since then,
  and that there are some further changes in-flight from e.g. Michael
  Haggerty.  If there is interest in this, I can rebase once Michael's
  changes land.
  ...
  The db backend runs git for-each-ref about 30% faster than the files
  backend with fully-packed refs on a repo with ~120k refs.  It's also
  about 4x faster than using fully-unpacked refs.  In addition, and
  perhaps more importantly, it avoids case-conflict issues on OS X.
 
  I chose to use LMDB for the database...
  ...
  Ronnie Sahlberg's original version of this patchset used tdb.  The
  advantage of tdb is that it's smaller (~125k).  The disadvantages are
  that tdb is hard to build on OS X.  It's also not in homebrew.  So lmdb
  seemed simpler.
 
 If there is interest?  Shut up and take my money ;-)
 
 More seriously, that's great that you stepped up to resurrect this
 topic.  In a sense, the choice of sample database backend does not
 matter.  I do not care if it is tdb, lmdb, or even Berkeley DB as
 long as it functions. ;-)
 
 As long as the interface between ref-transaction system on the Git
 side and the database backend is designed right, your lmdb thing can
 serve as a reference implementation for other people to plug other
 database backends to the same interface, right? 

Yes.

  As one step to
 validate the interface to the database backends, it would be nice to
 eventually have at least two backends that talk to meaningfully
 different systems, but we have to start somewhere, and for now we
 have lmdb is as good a place to start as any other db backend.
 
 I wonder if we can do a filesystem backend on top of the same
 backend interface---is that too much impedance mismatch to make it
 unpractical?

The patch series does include a filesystem backend, which is simply the
current ref infrastructure with extremely minor changes.  

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Untracked files when git status executed on a new folder

2015-06-23 Thread Johannes Löthberg

On 23/06, Víctor Martín Hernández wrote:

Hi all.
Today I've had an unexpected behaviour that I'm not sure if is a bug or
I'm not doing git best practices... (surely the latest...)
The sequence of actions is :

1. create a new subfolder of my local repository branch
2. cd to this new folder, and create a new file
3. execute git status from the new folder

Doing that, the new folder doesn't appear as untracked.

4. cd ..
5. git status
In this case, the new folder appears.

If I create a new folder on the same level that the new one created in
step 1, cd into it, and execute git status, the folder created in step 1
appears as untracked.



Can't reproduce on Git 2.4.4/Linux, which Git version and platform are 
you using?


--
Sincerely,
 Johannes Löthberg
 PGP Key ID: 0x50FB9B273A9D0BB5
 https://theos.kyriasis.com/~kyrias/


signature.asc
Description: PGP signature


[PATCH v7 2/5] bisect: replace hardcoded bad|good by variables

2015-06-23 Thread Matthieu Moy
From: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr

To add new tags like old/new and have keywords less confusing, the
first step is to avoid hardcoding the keywords.

The default mode is still bad/good.

Signed-off-by: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr
Signed-off-by: Louis Stuber stub...@ensimag.grenoble-inp.fr
Signed-off-by: Valentin Duperray valentin.duper...@ensimag.imag.fr
Signed-off-by: Franck Jonas franck.jo...@ensimag.imag.fr
Signed-off-by: Lucien Kong lucien.k...@ensimag.imag.fr
Signed-off-by: Thomas Nguy thomas.n...@ensimag.imag.fr
Signed-off-by: Huynh Khoi Nguyen Nguyen 
huynh-khoi-nguyen.ngu...@ensimag.imag.fr
Signed-off-by: Matthieu Moy matthieu@grenoble-inp.fr
Signed-off-by: Matthieu Moy matthieu@imag.fr
---
 bisect.c  | 51 ++-
 git-bisect.sh | 57 +++--
 2 files changed, 65 insertions(+), 43 deletions(-)
 mode change 100755 = 100644 git-bisect.sh

diff --git a/bisect.c b/bisect.c
index 5b8357d..2d3dbdc 100644
--- a/bisect.c
+++ b/bisect.c
@@ -21,6 +21,9 @@ static const char *argv_checkout[] = {checkout, -q, NULL, 
--, NULL};
 static const char *argv_show_branch[] = {show-branch, NULL, NULL};
 static const char *argv_update_ref[] = {update-ref, --no-deref, 
BISECT_HEAD, NULL, NULL};
 
+static const char *name_bad;
+static const char *name_good;
+
 /* Remember to update object flag allocation in object.h */
 #define COUNTED(1u16)
 
@@ -403,15 +406,21 @@ struct commit_list *find_bisection(struct commit_list 
*list,
 static int register_ref(const char *refname, const struct object_id *oid,
int flags, void *cb_data)
 {
-   if (!strcmp(refname, bad)) {
+   struct strbuf good_prefix = STRBUF_INIT;
+   strbuf_addstr(good_prefix, name_good);
+   strbuf_addstr(good_prefix, -);
+
+   if (!strcmp(refname, name_bad)) {
current_bad_oid = xmalloc(sizeof(*current_bad_oid));
oidcpy(current_bad_oid, oid);
-   } else if (starts_with(refname, good-)) {
+   } else if (starts_with(refname, good_prefix.buf)) {
sha1_array_append(good_revs, oid-hash);
} else if (starts_with(refname, skip-)) {
sha1_array_append(skipped_revs, oid-hash);
}
 
+   strbuf_release(good_prefix);
+
return 0;
 }
 
@@ -634,7 +643,7 @@ static void exit_if_skipped_commits(struct commit_list 
*tried,
return;
 
printf(There are only 'skip'ped commits left to test.\n
-  The first bad commit could be any of:\n);
+  The first %s commit could be any of:\n, name_bad);
print_commit_list(tried, %s\n, %s\n);
if (bad)
printf(%s\n, oid_to_hex(bad));
@@ -732,18 +741,21 @@ static void handle_bad_merge_base(void)
if (is_expected_rev(current_bad_oid)) {
char *bad_hex = oid_to_hex(current_bad_oid);
char *good_hex = join_sha1_array_hex(good_revs, ' ');
-
-   fprintf(stderr, The merge base %s is bad.\n
-   This means the bug has been fixed 
-   between %s and [%s].\n,
-   bad_hex, bad_hex, good_hex);
-
+   if (!strcmp(name_bad, bad)) {
+   fprintf(stderr, The merge base %s is bad.\n
+   This means the bug has been fixed 
+   between %s and [%s].\n,
+   bad_hex, bad_hex, good_hex);
+   } else {
+   die(BUG: terms %s/%s not managed, name_bad, 
name_good);
+   }
exit(3);
}
 
-   fprintf(stderr, Some good revs are not ancestor of the bad rev.\n
+   fprintf(stderr, Some %s revs are not ancestor of the %s rev.\n
git bisect cannot work properly in this case.\n
-   Maybe you mistook good and bad revs?\n);
+   Maybe you mistook %s and %s revs?\n,
+   name_good, name_bad, name_good, name_bad);
exit(1);
 }
 
@@ -755,10 +767,10 @@ static void handle_skipped_merge_base(const unsigned char 
*mb)
 
warning(the merge base between %s and [%s] 
must be skipped.\n
-   So we cannot be sure the first bad commit is 
+   So we cannot be sure the first %s commit is 
between %s and %s.\n
We continue anyway.,
-   bad_hex, good_hex, mb_hex, bad_hex);
+   bad_hex, good_hex, name_bad, mb_hex, bad_hex);
free(good_hex);
 }
 
@@ -839,7 +851,7 @@ static void check_good_are_ancestors_of_bad(const char 
*prefix, int no_checkout)
int fd;
 
if (!current_bad_oid)
-   die(a bad revision is needed);
+   die(a %s revision is needed, name_bad);
 
/* Check if file BISECT_ANCESTORS_OK exists. */
if 

[PATCH v7 5/5] bisect: allows any terms set by user

2015-06-23 Thread Matthieu Moy
From: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr

Introduction of the git bisect terms function.
The user can set its own terms.
It will work exactly like before. The terms must be set
before the start.

Signed-off-by: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr
Signed-off-by: Louis Stuber stub...@ensimag.grenoble-inp.fr
Signed-off-by: Matthieu Moy matthieu@imag.fr
---
 Documentation/git-bisect.txt | 19 +
 git-bisect.sh| 68 
 t/t6030-bisect-porcelain.sh  | 43 
 3 files changed, 125 insertions(+), 5 deletions(-)

diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt
index 3c3021a..ef0c03c 100644
--- a/Documentation/git-bisect.txt
+++ b/Documentation/git-bisect.txt
@@ -133,6 +133,25 @@ You must run `git bisect start` without commits as 
argument and run
 `git bisect new rev`/`git bisect old rev...` after to add the
 commits.
 
+Alternative terms: use your own terms
+
+
+If the builtins terms bad/good and new/old do not satisfy you, you can
+set your own terms.
+
+
+git bisect terms term1 term2
+
+
+This command has to be used before a bisection has started.
+The term1 must be associated with the latest revisions and term2 with the
+ancestors of term1.
+
+Only the first bisection following the 'git bisect terms' will use the terms.
+If you mistyped one of the terms you can do again 'git bisect terms term1
+term2'.
+
+
 Bisect visualize
 
 
diff --git a/git-bisect.sh b/git-bisect.sh
index 73763a2..8ef2b94 100644
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -1,6 +1,6 @@
 #!/bin/sh
 
-USAGE='[help|start|bad|good|new|old|skip|next|reset|visualize|replay|log|run]'
+USAGE='[help|start|bad|good|new|old|terms|skip|next|reset|visualize|replay|log|run]'
 LONG_USAGE='git bisect help
print this long help message.
 git bisect start [--no-checkout] [bad [good...]] [--] [pathspec...]
@@ -11,6 +11,8 @@ git bisect (bad|new) [rev]
 git bisect (good|old) [rev...]
mark rev... known-good revisions/
revisions before change in a given property.
+git bisect terms term1 term2
+   set up term1 and term2 as bisection terms.
 git bisect skip [(rev|range)...]
mark rev... untestable revisions.
 git bisect next
@@ -82,6 +84,14 @@ bisect_start() {
# revision_seen is true if a git bisect start
# has revision as arguments
revision_seen=0
+   # terms_defined is used to detect if the user
+   # defined his own terms with git bisect terms
+   terms_defined=0
+   if test -s $GIT_DIR/TERMS_DEFINED
+   then
+   terms_defined=1
+   get_terms
+   fi
if test z$(git rev-parse --is-bare-repository) != zfalse
then
mode=--no-checkout
@@ -180,10 +190,14 @@ bisect_start() {
} 
git rev-parse --sq-quote $@ $GIT_DIR/BISECT_NAMES 
eval $eval true 
-   if test $revision_seen -eq 1  test ! -s $GIT_DIR/BISECT_TERMS
+   if test $revision_seen -eq 1  test ! -s $GIT_DIR/BISECT_TERMS || 
test $terms_defined -eq 1
then
echo $NAME_BAD $GIT_DIR/BISECT_TERMS 
-   echo $NAME_GOOD $GIT_DIR/BISECT_TERMS
+   echo $NAME_GOOD $GIT_DIR/BISECT_TERMS 
+   if test $terms_defined -eq 1
+   then
+   echo git bisect terms $NAME_BAD $NAME_GOOD 
$GIT_DIR/BISECT_LOG || exit
+   fi
fi 
echo git bisect start$orig_args $GIT_DIR/BISECT_LOG || exit
#
@@ -419,6 +433,7 @@ bisect_clean_state() {
rm -f $GIT_DIR/BISECT_NAMES 
rm -f $GIT_DIR/BISECT_RUN 
rm -f $GIT_DIR/BISECT_TERMS 
+   rm -f $GIT_DIR/TERMS_DEFINED 
# Cleanup head-name if it got left by an old version of git-bisect
rm -f $GIT_DIR/head-name 
git update-ref -d --no-deref BISECT_HEAD 
@@ -447,6 +462,8 @@ bisect_replay () {
eval $cmd ;;
$NAME_GOOD|$NAME_BAD|skip)
bisect_write $command $rev ;;
+   terms)
+   bisect_terms $rev ;;
*)
die $(gettext ?? what are you talking about?) ;;
esac
@@ -531,7 +548,8 @@ get_terms () {
 check_and_set_terms () {
cmd=$1
case $cmd in
-   bad|good|new|old)
+   skip|start|terms) ;;
+   *)
if test -s $GIT_DIR/BISECT_TERMS  test $cmd != 
$NAME_BAD  test $cmd != $NAME_GOOD
then
die $(eval_gettext Invalid command: you're currently 
in a \$NAME_BAD/\$NAME_GOOD bisect.)
@@ -564,6 +582,44 @@ bisect_voc () {
esac
 }
 
+bisect_terms () {
+   case $# in
+   0)
+   if test -s $GIT_DIR/BISECT_TERMS
+ 

[PATCH v7 1/5] bisect: correction of typo

2015-06-23 Thread Matthieu Moy
From: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr

Signed-off-by: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr
Signed-off-by: Matthieu Moy matthieu@imag.fr
---
 bisect.c| 2 +-
 t/t6030-bisect-porcelain.sh | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/bisect.c b/bisect.c
index 03d5cd9..5b8357d 100644
--- a/bisect.c
+++ b/bisect.c
@@ -743,7 +743,7 @@ static void handle_bad_merge_base(void)
 
fprintf(stderr, Some good revs are not ancestor of the bad rev.\n
git bisect cannot work properly in this case.\n
-   Maybe you mistake good and bad revs?\n);
+   Maybe you mistook good and bad revs?\n);
exit(1);
 }
 
diff --git a/t/t6030-bisect-porcelain.sh b/t/t6030-bisect-porcelain.sh
index 06b4868..9e2c203 100755
--- a/t/t6030-bisect-porcelain.sh
+++ b/t/t6030-bisect-porcelain.sh
@@ -362,7 +362,7 @@ test_expect_success 'bisect starting with a detached HEAD' '
 test_expect_success 'bisect errors out if bad and good are mistaken' '
git bisect reset 
test_must_fail git bisect start $HASH2 $HASH4 2 rev_list_error 
-   grep mistake good and bad rev_list_error 
+   grep mistook good and bad rev_list_error 
git bisect reset
 '
 
-- 
2.4.4.414.ge37915c

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v7 4/5] bisect: add the terms old/new

2015-06-23 Thread Matthieu Moy
From: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr

When not looking for a regression during a bisect but for a fix or a
change in another given property, it can be confusing to use 'good'
and 'bad'.

This patch introduce `git bisect new` and `git bisect old` as an
alternative to 'bad' and good': the commits which have a certain
property must be marked as `new` and the ones which do not as `old`.

The output will be the first commit after the change in the property.
During a new/old bisect session you cannot use bad/good commands and
vice-versa.

Some commands are still not available for old/new:
 * git rev-list --bisect does not treat the revs/bisect/new and
   revs/bisect/old-SHA1 files.

Old discussions:
- http://thread.gmane.org/gmane.comp.version-control.git/86063
introduced bisect fix unfixed to find fix.
- http://thread.gmane.org/gmane.comp.version-control.git/182398
discussion around bisect yes/no or old/new.
- http://thread.gmane.org/gmane.comp.version-control.git/199758
last discussion and reviews
New discussions:
- http://thread.gmane.org/gmane.comp.version-control.git/271320
( v2 1/7-4/7 )
- http://comments.gmane.org/gmane.comp.version-control.git/271343
( v2 5/7-7/7 )

Signed-off-by: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr
Signed-off-by: Louis Stuber stub...@ensimag.grenoble-inp.fr
Signed-off-by: Valentin Duperray valentin.duper...@ensimag.imag.fr
Signed-off-by: Franck Jonas franck.jo...@ensimag.imag.fr
Signed-off-by: Lucien Kong lucien.k...@ensimag.imag.fr
Signed-off-by: Thomas Nguy thomas.n...@ensimag.imag.fr
Signed-off-by: Huynh Khoi Nguyen Nguyen 
huynh-khoi-nguyen.ngu...@ensimag.imag.fr
Signed-off-by: Matthieu Moy matthieu@grenoble-inp.fr
Signed-off-by: Matthieu Moy matthieu@imag.fr
---
 Documentation/git-bisect.txt | 48 ++--
 bisect.c | 11 +++---
 git-bisect.sh| 30 ++-
 t/t6030-bisect-porcelain.sh  | 38 +++
 4 files changed, 112 insertions(+), 15 deletions(-)

diff --git a/Documentation/git-bisect.txt b/Documentation/git-bisect.txt
index 4cb52a7..3c3021a 100644
--- a/Documentation/git-bisect.txt
+++ b/Documentation/git-bisect.txt
@@ -18,8 +18,8 @@ on the subcommand:
 
  git bisect help
  git bisect start [--no-checkout] [bad [good...]] [--] [paths...]
- git bisect bad [rev]
- git bisect good [rev...]
+ git bisect (bad|new) [rev]
+ git bisect (good|old) [rev...]
  git bisect skip [(rev|range)...]
  git bisect reset [commit]
  git bisect visualize
@@ -104,6 +104,35 @@ For example, `git bisect reset HEAD` will leave you on the 
current
 bisection commit and avoid switching commits at all, while `git bisect
 reset bisect/bad` will check out the first bad revision.
 
+
+Alternative terms: bisect new and bisect old
+
+
+If you are not at ease with the terms bad and good, perhaps
+because you are looking for the commit that introduced a fix, you can
+alternatively use new and old instead.
+But note that you cannot mix bad and good with new and old.
+
+
+git bisect new [rev]
+
+
+Marks the commit as new, e.g. the bug is no longer there, if you are looking
+for a commit that fixed a bug, or the feature that used to work is now broken
+at this point, if you are looking for a commit that introduced a bug.
+It is the equivalent of git bisect bad [rev].
+
+
+git bisect old [rev...]
+
+
+Marks the commit as old, as the opposite of 'git bisect new'.
+It is the equivalent of git bisect good [rev...].
+
+You must run `git bisect start` without commits as argument and run
+`git bisect new rev`/`git bisect old rev...` after to add the
+commits.
+
 Bisect visualize
 
 
@@ -379,6 +408,21 @@ In this case, when 'git bisect run' finishes, bisect/bad 
will refer to a commit
 has at least one parent whose reachable graph is fully traversable in the sense
 required by 'git pack objects'.
 
+* Look for a fix instead of a regression in the code
++
+
+$ git bisect start
+$ git bisect new HEAD# current commit is marked as new
+$ git bisect old HEAD~10 # the tenth commit from now is marked as old
+
++
+Let's consider the last commit has a given property, and that we are looking
+for the commit which introduced this property. For each commit the bisection
+guide us to, we will test if the property is present. If it is we will mark
+the commit as new with 'git bisect new', otherwise we will mark it as old.
+At the end of the bisect session, the result will be the first new commit (e.g
+the first one with the property).
+
 
 SEE ALSO
 
diff 

[PATCH v7 3/5] bisect: simplify the addition of new bisect terms

2015-06-23 Thread Matthieu Moy
From: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr

We create a file BISECT_TERMS in the repository .git to be read during a
bisection. The fonctions to be changed if we add new terms are quite
few.
In git-bisect.sh :
check_and_set_terms
bisect_voc

Co-authored-by: Louis Stuber stub...@ensimag.grenoble-inp.fr
Tweaked-by: Matthieu Moy matthieu@grenoble-inp.fr
Signed-off-by: Antoine Delaite antoine.dela...@ensimag.grenoble-inp.fr
Signed-off-by: Louis Stuber stub...@ensimag.grenoble-inp.fr
Signed-off-by: Valentin Duperray valentin.duper...@ensimag.imag.fr
Signed-off-by: Franck Jonas franck.jo...@ensimag.imag.fr
Signed-off-by: Lucien Kong lucien.k...@ensimag.imag.fr
Signed-off-by: Thomas Nguy thomas.n...@ensimag.imag.fr
Signed-off-by: Huynh Khoi Nguyen Nguyen 
huynh-khoi-nguyen.ngu...@ensimag.imag.fr
Signed-off-by: Matthieu Moy matthieu@grenoble-inp.fr
Signed-off-by: Matthieu Moy matthieu@imag.fr
---
 bisect.c  | 38 +---
 git-bisect.sh | 70 +--
 revision.c| 26 --
 3 files changed, 122 insertions(+), 12 deletions(-)

diff --git a/bisect.c b/bisect.c
index 2d3dbdc..08be634 100644
--- a/bisect.c
+++ b/bisect.c
@@ -747,7 +747,10 @@ static void handle_bad_merge_base(void)
between %s and [%s].\n,
bad_hex, bad_hex, good_hex);
} else {
-   die(BUG: terms %s/%s not managed, name_bad, 
name_good);
+   fprintf(stderr, The merge base %s is %s.\n
+   This means the first commit marked %s is 
+   between %s and [%s].\n,
+   bad_hex, name_bad, name_bad, bad_hex, good_hex);
}
exit(3);
}
@@ -902,6 +905,36 @@ static void show_diff_tree(const char *prefix, struct 
commit *commit)
 }
 
 /*
+ * The terms used for this bisect session are stored in BISECT_TERMS.
+ * We read them and store them to adapt the messages accordingly.
+ * Default is bad/good.
+ */
+void read_bisect_terms(const char **read_bad, const char **read_good)
+{
+   struct strbuf str = STRBUF_INIT;
+   const char *filename = git_path(BISECT_TERMS);
+   FILE *fp = fopen(filename, r);
+
+   if (!fp) {
+   if (errno == ENOENT) {
+   *read_bad = bad;
+   *read_good = good;
+   return;
+   } else {
+   die(could not read file '%s': %s, filename,
+   strerror(errno));
+   }
+   } else {
+   strbuf_getline(str, fp, '\n');
+   *read_bad = strbuf_detach(str, NULL);
+   strbuf_getline(str, fp, '\n');
+   *read_good = strbuf_detach(str, NULL);
+   }
+   strbuf_release(str);
+   fclose(fp);
+}
+
+/*
  * We use the convention that exiting with an exit code 10 means that
  * the bisection process finished successfully.
  * In this case the calling shell script should exit 0.
@@ -917,8 +950,7 @@ int bisect_next_all(const char *prefix, int no_checkout)
const unsigned char *bisect_rev;
char bisect_rev_hex[GIT_SHA1_HEXSZ + 1];
 
-   name_bad = bad;
-   name_good = good;
+   read_bisect_terms(name_bad, name_good);
if (read_bisect_refs())
die(reading bisect refs failed);
 
diff --git a/git-bisect.sh b/git-bisect.sh
index ce6412f..7bb18db 100644
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -77,6 +77,9 @@ bisect_start() {
orig_args=$(git rev-parse --sq-quote $@)
bad_seen=0
eval=''
+   # revision_seen is true if a git bisect start
+   # has revision as arguments
+   revision_seen=0
if test z$(git rev-parse --is-bare-repository) != zfalse
then
mode=--no-checkout
@@ -101,6 +104,9 @@ bisect_start() {
die $(eval_gettext '\$arg' does not appear to 
be a valid revision)
break
}
+
+   revision_seen=1
+
case $bad_seen in
0) state=$NAME_BAD ; bad_seen=1 ;;
*) state=$NAME_GOOD ;;
@@ -172,6 +178,11 @@ bisect_start() {
} 
git rev-parse --sq-quote $@ $GIT_DIR/BISECT_NAMES 
eval $eval true 
+   if test $revision_seen -eq 1  test ! -s $GIT_DIR/BISECT_TERMS
+   then
+   echo $NAME_BAD $GIT_DIR/BISECT_TERMS 
+   echo $NAME_GOOD $GIT_DIR/BISECT_TERMS
+   fi 
echo git bisect start$orig_args $GIT_DIR/BISECT_LOG || exit
#
# Check if we can proceed to the next bisect state.
@@ -232,6 +243,7 @@ bisect_skip() {
 bisect_state() {
bisect_autostart
state=$1
+   check_and_set_terms $state
case $#,$state 

[PATCH v7 0/5] git bisect old/new

2015-06-23 Thread Matthieu Moy
Hi,

I fixed a few minor issues in v6. Patch between my version and v6 is
below. I also pushed my branch here:

  https://github.com/moy/git/tree/bisect-terms

Not visible in the patch below: I squashed PATCH 5 into PATCH 3 to
avoid the pattern break stuff and then repair it.

The first two patches seem ready.

PATCH 4 (add old/new) is still buggy. When starting a bisection with

  git bisect start $old $new

the command git bisect visualize does not work (it shows no commit).

I consider PATCH 5 as WIP, I think it would need a lot of polishing
and testing to be mergeable. I think a reasonable objective for now it
to get old/new working in the user-interface, and drop PATCH 5 from
the series when it gets merged. The existance of PATCH 5 is a very
good thing even if it doesn't get merged:

* The fact that it's possible to do it on top of the series shows that
  we make the code more generic. I think it's important that
  regardless of features, the code moves in the right direction.

* The patch can be taken over later by someone else.

diff --git a/bisect.c b/bisect.c
index 7492fdc..ab09650 100644
--- a/bisect.c
+++ b/bisect.c
@@ -921,7 +921,7 @@ void read_bisect_terms(const char **read_bad, const char 
**read_good)
FILE *fp = fopen(filename, r);
 
if (!fp) {
-   if (errno==2) {
+   if (errno == ENOENT) {
*read_bad = bad;
*read_good = good;
return;
diff --git a/git-bisect.sh b/git-bisect.sh
index 7da22b1..8ef2b94 100644
--- a/git-bisect.sh
+++ b/git-bisect.sh
@@ -541,7 +541,7 @@ get_terms () {
{
read NAME_BAD
read NAME_GOOD
-   }$GIT_DIR/BISECT_TERMS   
+   } $GIT_DIR/BISECT_TERMS
fi
 }
 
@@ -605,8 +605,8 @@ bisect_terms () {
echo 1 $GIT_DIR/TERMS_DEFINED
echo git bisect terms $NAME_BAD $NAME_GOOD 
$GIT_DIR/BISECT_LOG || exit
else
-   die $(gettext A bisection has already started, and 
you can't change terms in the middle of it. 
-Use 'git bisect terms' to see the current terms. 
+   die $(gettext A bisection has already started, and 
you can't change terms in the middle of it.
+Use 'git bisect terms' to see the current terms.
 Otherwise, to start a new bisection with new terms, please use 'git bisect 
reset' and set the terms before the start)
fi ;;
*)
diff --git a/revision.c b/revision.c
index 27750ac..f22923f 100644
--- a/revision.c
+++ b/revision.c
@@ -2083,18 +2083,28 @@ extern void read_bisect_terms(const char **bad, const 
char **good);
 
 static int for_each_bad_bisect_ref(const char *submodule, each_ref_fn fn, void 
*cb_data)
 {
-   char bisect_refs_path[256];
-   strcpy(bisect_refs_path, refs/bisect/);
-   strcat(bisect_refs_path, name_bad);
-   return for_each_ref_in_submodule(submodule, bisect_refs_path, fn, 
cb_data);
+   struct strbuf bisect_refs_buf = STRBUF_INIT;
+   const char *bisect_refs_str;
+   int status;
+   strbuf_addstr(bisect_refs_buf, refs/bisect/);
+   strbuf_addstr(bisect_refs_buf, name_bad);
+   bisect_refs_str = strbuf_detach(bisect_refs_buf, NULL);
+   status = for_each_ref_in_submodule(submodule, bisect_refs_str, fn, 
cb_data);
+   free((char *)bisect_refs_str);
+   return status;
 }
 
 static int for_each_good_bisect_ref(const char *submodule, each_ref_fn fn, 
void *cb_data)
 {
-   char bisect_refs_path[256];
-   strcpy(bisect_refs_path, refs/bisect/);
-   strcat(bisect_refs_path, name_good);
-   return for_each_ref_in_submodule(submodule, bisect_refs_path, fn, 
cb_data);
+   struct strbuf bisect_refs_buf = STRBUF_INIT;
+   const char *bisect_refs_str;
+   int status;
+   strbuf_addstr(bisect_refs_buf, refs/bisect/);
+   strbuf_addstr(bisect_refs_buf, name_bad);
+   bisect_refs_str = strbuf_detach(bisect_refs_buf, NULL);
+   status = for_each_ref_in_submodule(submodule, bisect_refs_str, fn, 
cb_data);
+   free((char *)bisect_refs_str);
+   return status;
 }
 
 static int handle_revision_pseudo_opt(const char *submodule,
diff --git a/t/t6030-bisect-porcelain.sh b/t/t6030-bisect-porcelain.sh
index d91116e..289dbb0 100755
--- a/t/t6030-bisect-porcelain.sh
+++ b/t/t6030-bisect-porcelain.sh
@@ -781,12 +781,12 @@ test_expect_success 'bisect start with one new and old' '
git bisect new 
git bisect new bisect_result 
grep $HASH2 is the first new commit bisect_result 
-   git bisect log  log_to_replay.txt 
+   git bisect log log_to_replay.txt 
git bisect reset
 '
 
 test_expect_success 'bisect replay with old and new' '
-   git bisect replay log_to_replay.txt  bisect_result 
+   git bisect replay log_to_replay.txt bisect_result 
grep $HASH2 is the first new commit bisect_result 
git bisect reset
 '

Re: [PATCH] apply: fix adding new files on i-t-a entries

2015-06-23 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

  I think I'm opening a can of worms with d95d728

 I cannot offhand convince myself that apply is the only casualty;
 assuming it is, I think a reasonable way forward is to keep d95d728
 and adjust apply to the new world order.  Otherwise, i.e. if there
 are wider fallouts from d95d728, we may instead want to temporarily
 revert it off from 'master', deal with fallouts to apply and other
 things, before resurrecting it.

 Anything that internally uses diff-index is suspect, I think.

 What do others think?  You seem to ...

  So far blame, rm and checkout-entry and checkout paths are on my
  to-think-or-fix list. But this patch can get in early to fix a real
  regression instead of waiting for one big series. A lot more
  discussions will be had before that series gets in good shape.

 ... think that the damage could be quite extensive, so I am inclined
 to say that we first revert d95d728 before going forward.

Let's do this on 'nd/diff-i-t-a' topic and merge the result
immediately to 'master' for now.

-- 8 --
From: Junio C Hamano gits...@pobox.com
Date: Tue, 23 Jun 2015 10:27:47 -0700
Subject: [PATCH] Revert diff-lib.c: adjust position of i-t-a entries in diff

This reverts commit d95d728aba06a34394d15466045cbdabdada58a2.

It turns out that many other commands that need to interact with the
result of running diff-files and diff-index, e.g.  git apply, git
rm, etc., need to be adjusted to the new world order it brings in.
For example, it would break this sequence to correct a whitespace
breakage in the parts you changed:

git add -N file
git diff --cached file | git apply --cached --whitespace=fix
git checkout file

In the old world order, diff showed a patch to modify an existing
empty file by adding its full contents, and apply updated the
index by modifying the existing empty blob (which is what an
Intent-to-Add entry records in the index) with that patch.

In the new world order, diff shows a patch to create a new file
with its full contents, but because apply thinks that the i-t-a
entry already exists in the index, it refused to accept a creation.

Adjusting apply to this new world order is easy, but we need to
assess the extent of the damage to the rest of the system the new
world order brought in before going forward and adjust them all,
after which we can resurrect the commit being reverted here.
---
 builtin/add.c   |  1 -
 diff-lib.c  | 12 
 t/t2203-add-intent.sh   | 23 ---
 t/t4011-diff-symlink.sh | 10 --
 4 files changed, 8 insertions(+), 38 deletions(-)

diff --git a/builtin/add.c b/builtin/add.c
index ee370b0..3390933 100644
--- a/builtin/add.c
+++ b/builtin/add.c
@@ -63,7 +63,6 @@ static void update_callback(struct diff_queue_struct *q,
switch (fix_unmerged_status(p, data)) {
default:
die(_(unexpected diff status %c), p-status);
-   case DIFF_STATUS_ADDED:
case DIFF_STATUS_MODIFIED:
case DIFF_STATUS_TYPE_CHANGED:
if (add_file_to_index(the_index, path, data-flags)) {
diff --git a/diff-lib.c b/diff-lib.c
index 714501a..a85c497 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -212,11 +212,6 @@ int run_diff_files(struct rev_info *revs, unsigned int 
option)
   ce-sha1, 
!is_null_sha1(ce-sha1),
   ce-name, 0);
continue;
-   } else if (ce-ce_flags  CE_INTENT_TO_ADD) {
-   diff_addremove(revs-diffopt, '+', ce-ce_mode,
-  EMPTY_BLOB_SHA1_BIN, 0,
-  ce-name, 0);
-   continue;
}
 
changed = match_stat_with_submodule(revs-diffopt, ce, 
st,
@@ -381,13 +376,6 @@ static void do_oneway_diff(struct unpack_trees_options *o,
struct rev_info *revs = o-unpack_data;
int match_missing, cached;
 
-   /* i-t-a entries do not actually exist in the index */
-   if (idx  (idx-ce_flags  CE_INTENT_TO_ADD)) {
-   idx = NULL;
-   if (!tree)
-   return; /* nothing to diff.. */
-   }
-
/* if the entry is not checked out, don't examine work tree */
cached = o-index_only ||
(idx  ((idx-ce_flags  CE_VALID) || ce_skip_worktree(idx)));
diff --git a/t/t2203-add-intent.sh b/t/t2203-add-intent.sh
index 7c641bf..2a4a749 100755
--- a/t/t2203-add-intent.sh
+++ b/t/t2203-add-intent.sh
@@ -5,24 +5,10 @@ test_description='Intent to add'
 . ./test-lib.sh
 
 test_expect_success 'intent to add' '
-   test_commit 1 
-   git rm 1.t 
-   echo hello 1.t 
echo hello file 
echo hello elif 
git add -N file 
-   git add elif 
-   

Re: [PATCH v7 3/5] bisect: simplify the addition of new bisect terms

2015-06-23 Thread Eric Sunshine
On Tue, Jun 23, 2015 at 8:54 AM, Matthieu Moy matthieu@imag.fr wrote:
 diff --git a/revision.c b/revision.c
 index 3ff8723..f22923f 100644
 --- a/revision.c
 +++ b/revision.c
 @@ -2076,14 +2079,32 @@ void parse_revision_opt(struct rev_info *revs, struct 
 parse_opt_ctx_t *ctx,
 ctx-argc -= n;
  }

 +extern void read_bisect_terms(const char **bad, const char **good);
 +
  static int for_each_bad_bisect_ref(const char *submodule, each_ref_fn fn, 
 void *cb_data)
  {
 -   return for_each_ref_in_submodule(submodule, refs/bisect/bad, fn, 
 cb_data);
 +   struct strbuf bisect_refs_buf = STRBUF_INIT;
 +   const char *bisect_refs_str;
 +   int status;
 +   strbuf_addstr(bisect_refs_buf, refs/bisect/);
 +   strbuf_addstr(bisect_refs_buf, name_bad);

A single strbuf_addf() rather than two strbuf_addstr()s?

 +   bisect_refs_str = strbuf_detach(bisect_refs_buf, NULL);
 +   status = for_each_ref_in_submodule(submodule, bisect_refs_str, fn, 
 cb_data);
 +   free((char *)bisect_refs_str);

Why the above rather than the simpler?

strbuf_addstr(bisect_refs, ...);
status = for_each_ref_in_submodule(submodule, bisect_refs.buf, fn, cb_data);
strbuf_release(bisect_refs);

What am I missing?

 +   return status;
  }

  static int for_each_good_bisect_ref(const char *submodule, each_ref_fn fn, 
 void *cb_data)
  {
 -   return for_each_ref_in_submodule(submodule, refs/bisect/good, fn, 
 cb_data);
 +   struct strbuf bisect_refs_buf = STRBUF_INIT;
 +   const char *bisect_refs_str;
 +   int status;
 +   strbuf_addstr(bisect_refs_buf, refs/bisect/);
 +   strbuf_addstr(bisect_refs_buf, name_bad);
 +   bisect_refs_str = strbuf_detach(bisect_refs_buf, NULL);
 +   status = for_each_ref_in_submodule(submodule, bisect_refs_str, fn, 
 cb_data);
 +   free((char *)bisect_refs_str);

Ditto.

 +   return status;
  }
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread Stefan Beller
[+ronniesahlb...@gmail.com, FYI]

On Mon, Jun 22, 2015 at 5:50 PM, David Turner dtur...@twopensource.com wrote:
 I've revived and modified Ronnie Sahlberg's work on the refs db
 backend.

Awesome!


 The work is on top of be3c13e5564, Junio's First batch for 2.5 cycle.
 I recognize that there have been changes to the refs code since then,
 and that there are some further changes in-flight from e.g. Michael
 Haggerty.  If there is interest in this, I can rebase once Michael's
 changes land.

Originally I wanted to continue on Ronnies work, but because of the churn
in refs I stopped it for a while and took care of other projects (and wanted
to come back eventually). Thanks for reviving this topic!

 The changes can be found here:
 https://github.com/dturner-tw/git.git on the dturner/pluggable-backends
 branch

 The db backend code was added in the penultimate commit; the rest is
 just code rearrangement and minor changes to make alternate backends
 possible.  There ended up being a fair amount of this rearrangement, but
 the end result is that almost the entire git test suite runs under the
 db backend without error (see below for details).

Looking at the end result in refs-be-db.c it feels like there are more
functions in the refs_be_db struct, did this originate from other design
choices? IIRC Ronnie wanted to have as least functions in there as
possible, and share as much of the code between the databases, such
that the glue between the db and the refs code is minimal.

Some random comments from looking over the branch briefly:

In the latest commit, (refs: tests for db backend), I am unsure about the
copyright annotations. At least a sole Copyright (c) 2007 Junio C Hamano
doesn't make sense to me. ;)

Typo in commit message bisect: use refs insfrastructure for BISECT_START

Some commits contain a ChangeId, which is a Gerrit leftover. :(

Thanks,
Stefan


 The db backend runs git for-each-ref about 30% faster than the files
 backend with fully-packed refs on a repo with ~120k refs.  It's also
 about 4x faster than using fully-unpacked refs.  In addition, and
 perhaps more importantly, it avoids case-conflict issues on OS X.

 I chose to use LMDB for the database.  LMDB has a few features that make
 it suitable for usage in git:

 1. It is relatively lightweight; it requires only one header file, and
 the library itself is under 300k (as opposed to 700k for
 e.g. sqlite).

 2. It is well-tested: it's been used in OpenLDAP for years.

 3. It's very fast.  LMDB's benchmarks show that it is among
 the fastest key-value stores.

 4. It has a relatively simple concurrency story; readers don't
 block writers and writers don't block readers.

 Ronnie Sahlberg's original version of this patchset used tdb.  The
 advantage of tdb is that it's smaller (~125k).  The disadvantages are
 that tdb is hard to build on OS X.  It's also not in homebrew.  So lmdb
 seemed simpler.

 To test this backend's correctness, I hacked test-lib.sh and
 test-lib-functions.sh to run all tests under the refs backend. Dozens
 of tests use manual ref/reflog reading/writing, or create submodules
 without passing --refs-backend-type to git init.  If those tests are
 changed to use the update-ref machinery or test-refs-be-db (or, in the
 case of packed-refs, corrupt refs, and dumb fetch tests, are skipped),
 the only remaining failing tests are the git-new-workdir tests and the
 gitweb tests.

 Please let me know how it would be best to proceed.

 --
 To unsubscribe from this list: send the line unsubscribe git in
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread David Turner
On Tue, 2015-06-23 at 07:47 -0400, Jeff King wrote:
 On Mon, Jun 22, 2015 at 08:50:56PM -0400, David Turner wrote:
 
  The db backend runs git for-each-ref about 30% faster than the files
  backend with fully-packed refs on a repo with ~120k refs.  It's also
  about 4x faster than using fully-unpacked refs.  In addition, and
  perhaps more importantly, it avoids case-conflict issues on OS X.
 
 Neat.
 
 Can you describe a bit more about the reflog handling?
 
 One of the problems we've had with large-ref repos is that the reflog
 storage is quite inefficient. You can pack all the refs, but you may
 still be stuck with a bunch of reflog files with one entry, wasting a
 whole inode. Doing a git repack when you have a million of those has
 horrible cold-cache performance. Basically anything that isn't
 one-file-per-reflog would be a welcome change. :)

Reflogs are stored in the database as well.  There is one header entry
per ref to indicate that a reflog is present, and then one database
entry per reflog entry; the entries are stored consecutively and
immediately following the header so that it's fast to iterate over them.

 It has also been a dream of mine to stop tying the reflogs specifically
 to the refs. I.e., have a spot for reflogs of branches that no longer
 exist, which allows us to retain them for deleted branches. Then you can
 possibly recover from a branch deletion, whereas now you have to dig
 through git fsck's dangling output. And the reflog, if you don't
 expire it, becomes a suitable audit log to find out what happened to
 each branch when (whereas now it is full of holes when things get
 deleted).

That would be cool, and I don't think it would be hard to add to my
current code; we could simply replace the header with a tombstone.
But I would prefer to wait until the series is merged; then we can build
on top of it.

 I dunno. Maybe I am overthinking it. But it really feels like the _refs_
 are a key/value thing, but the _reflogs_ are not. You can cram them into
 a key/value store, but you're probably operating on them as a big blob,
 then.

Reflogs are, conceptually, queues. I agree that a raw key-value store is
not a good way to store queues, but a B-Tree is not so terrible, since
it offers relatively fast iteration (amortized constant time IIRC).

  I chose to use LMDB for the database.  LMDB has a few features that make
  it suitable for usage in git:
 
 One of the complaints that Shawn had about sqlite is that there is no
 native Java implementation, which makes it hard for JGit to ship a
 compatible backend. I suspect the same is true for LMDB, but it is
 probably a lot simpler than sqlite (so reimplementation might be
 possible).
 
 But it may also be worth going with a slightly slower database if we can
 get wider compatibility for free.

There's a JNI interface to LMDB, which is, of course, not native.  I
don't think it would be too hard to entirely rewrite LMDB in Java, but
I'm not going to have time to do it for the forseeable future.  I've
asked Howard Chu if he knows of any efforts in progress.

  To test this backend's correctness, I hacked test-lib.sh and
  test-lib-functions.sh to run all tests under the refs backend. Dozens
  of tests use manual ref/reflog reading/writing, or create submodules
  without passing --refs-backend-type to git init.  If those tests are
  changed to use the update-ref machinery or test-refs-be-db (or, in the
  case of packed-refs, corrupt refs, and dumb fetch tests, are skipped),
  the only remaining failing tests are the git-new-workdir tests and the
  gitweb tests.
 
 I think we'll need to bump core.repositoryformatversion, too. See the
 patches I just posted here:
 
   http://thread.gmane.org/gmane.comp.version-control.git/272447

Thanks, that's valuable.  For the refs backend, opening the LMDB
database for writing is sufficient to block other writers.  Do you think
it would be valuable to provide a git hold-ref-lock command that simply
reads refs from stdin and keeps them locked until it reads EOF from
stdin?  That would allow cross-backend ref locking. 

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 3/5] bisect: simplify the addition of new bisect terms

2015-06-23 Thread Matthieu Moy
Eric Sunshine sunsh...@sunshineco.com writes:

 On Tue, Jun 23, 2015 at 8:54 AM, Matthieu Moy matthieu@imag.fr wrote:

 +   strbuf_addstr(bisect_refs_buf, refs/bisect/);
 +   strbuf_addstr(bisect_refs_buf, name_bad);

 A single strbuf_addf() rather than two strbuf_addstr()s?

 +   bisect_refs_str = strbuf_detach(bisect_refs_buf, NULL);
 +   status = for_each_ref_in_submodule(submodule, bisect_refs_str, fn, 
 cb_data);
 +   free((char *)bisect_refs_str);

 Why the above rather than the simpler?

 strbuf_addstr(bisect_refs, ...);
 status = for_each_ref_in_submodule(submodule, bisect_refs.buf, fn, 
 cb_data);
 strbuf_release(bisect_refs);

 What am I missing?

Indeed, your version is much better.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit

2015-06-23 Thread Eric Sunshine
On Mon, Jun 22, 2015 at 5:42 PM, Galan Rémi
remi.galan-alfo...@ensimag.grenoble-inp.fr wrote:
 Instead of removing a line to remove the commit, you can use the
 command drop (just like pick or edit). It has the same effect as
 deleting the line (removing the commit) except that you keep a visual
 trace of your actions, allowing a better control and reducing the
 possibility of removing a commit by mistake.

 Signed-off-by: Galan Rémi remi.galan-alfo...@ensimag.grenoble-inp.fr
 ---
 diff --git a/t/t3404-rebase-interactive.sh b/t/t3404-rebase-interactive.sh
 index ac429a0..ecd277c 100755
 --- a/t/t3404-rebase-interactive.sh
 +++ b/t/t3404-rebase-interactive.sh
 @@ -1102,4 +1102,20 @@ test_expect_success 'rebase -i commits that overwrite 
 untracked files (no ff)' '
 test $(git cat-file commit HEAD | sed -ne \$p) = I
  '

 +test_rebase_end () {
 +   test_when_finished git checkout master 
 +   git branch -D $1 
 +   test_might_fail git rebase --abort 
 +   git checkout -b $1 master
 +}

The way this is indented makes it difficult to see that lines 2 and 3
are continuations of 1. Perhaps format it like this instead?

test_rebase_end () {
test_when_finished git checkout master 
git branch -D $1 
test_might_fail git rebase --abort 
git checkout -b $1 master
}

 +
 +test_expect_success 'drop' '
 +   test_rebase_end dropTest 
 +   set_fake_editor 
 +   FAKE_LINES=1 drop 2 3 drop 4 5 git rebase -i --root 
 +   test E = $(git cat-file commit HEAD | sed -ne \$p) 
 +   test C = $(git cat-file commit HEAD^ | sed -ne \$p) 
 +   test A = $(git cat-file commit HEAD^^ | sed -ne \$p)
 +'
 +
  test_done
 --
 2.4.3.371.g8992f2a
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread David Turner
On Tue, 2015-06-23 at 17:23 +0700, Duy Nguyen wrote:
 On Tue, Jun 23, 2015 at 7:50 AM, David Turner
dtur...@twopensource.com wrote:
  To test this backend's correctness, I hacked test-lib.sh and
  test-lib-functions.sh to run all tests under the refs backend.
 
 Now we have two. split-index also benefits from running through full
 test suite like this. I propose we make make test run the test suite
 twice. The first run is with default configuration, no split index, no
 fancy ref backend. The second run enables split-index and switches to
 new backend, running through all test cases. In future we can also
 enable packv4 in this second run. There won't be a third run.
 
 When the second ref backend comes, we can switch between the two
 backends using a random number generator where we control both
 algorithm and seed, so that when a test fails, the user can give us
 their seed and we can re-run with the same configuration.

I'm not in love with this idea, because it makes it hard to do
exhaustive testing efficiently.  I would rather have make test run
through all tests under all combinations -- or at least all relevant
tests.  We could perhaps mark tests with a list of features that they
exercise, so that we don't have to run e.g. t8xxx with alternate refs
backends.  

 Dozens of tests use manual ref/reflog reading/writing, or create
submodules
  without passing --refs-backend-type to git init.  If those tests are
  changed to use the update-ref machinery or test-refs-be-db (or, in
the
  case of packed-refs, corrupt refs, and dumb fetch tests, are
skipped),
  the only remaining failing tests are the git-new-workdir tests and
the
  gitweb tests.
 
 I haven't read the series, but I guess you should also add a few tests
 to run on the first run, so new code is exercised a bit even if people
 skip the second run.

I did this already, yes.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit

2015-06-23 Thread Remi Galan Alfonso
Eric Sunshine sunsh...@sunshineco.com writes:
  +test_rebase_end () {
  +   test_when_finished git checkout master 
  +   git branch -D $1 
  +   test_might_fail git rebase --abort 
  +   git checkout -b $1 master
  +}
 
 The way this is indented makes it difficult to see that lines 2 and 3
 are continuations of 1. Perhaps format it like this instead?
 
 test_rebase_end () {
 test_when_finished git checkout master 
 git branch -D $1 
 test_might_fail git rebase --abort 
 git checkout -b $1 master
 }

I completely agree with you, moreover it was indented like this before.
I'll change it in my local version for now.

Ironically, it was modified after the following:

Galan Rémi remi.galan-alfo...@ensimag.grenoble-inp.fr writes:
 Eric Sunshine sunsh...@sunshineco.com writes:
   +test_expect_success 'rebase -i respects 
   rebase.missingCommitsCheck=ignore' '
   +   test_config rebase.missingCommitsCheck ignore 
   +   test_when_finished git checkout master 
   +   git branch -D tmp2 
 
  Strange indentation.
 
 Considering that 'git branch -D tmp2' is a part of test_when_finished,
 I wasn't sure of how it was supposed to be indented, so I did it this
 way to show that it was still within test_when_finished and not a
 separate command.
  test_when_finished git checkout master 
  git branch -D tmp2 
 Doesn't seem as clear, especially if you quickly read the lines.
 
 For now, I have removed the tab.

:p

Thanks,
Rémi
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit

2015-06-23 Thread Junio C Hamano
Galan Rémi  remi.galan-alfo...@ensimag.grenoble-inp.fr writes:

 +test_rebase_end () {
 + test_when_finished git checkout master 
 + git branch -D $1 

Is this one guaranteed to succeed?  Do we want to consider it a
failure to remove $1 (e.g. dropTest)?

$ git branch -D no-such-branch ; echo $?
error: branch 'no-such-branch' not found.
1

If dropTest branch did not exist before the test that begins with
a call to this function, what happens?

Besides, a function that must be called at the beginning of a test
piece has a name that ends with _end?  That sounds funny, no?

 + test_might_fail git rebase --abort 
 + git checkout -b $1 master
 +}

I'm wondering if this is not sufficient.

test_rebase_i_drop_prepare () {
git reset --hard 
git checkout -B $1 master
}

I am guessing that you named _end because it has when_finished, but
as far as I can tell, even after these three patches, the tests do
not really rely on the fact that it is on 'master' branch.  More
importantly, just being on 'master' branch is not a sufficient
cleanliness for the next test (and that is why you added these
branch -D and might-fail rebase --abort to this function in the
first place), it seems.  So...

 +test_expect_success 'drop' '
 + test_rebase_end dropTest 
 + set_fake_editor 
 + FAKE_LINES=1 drop 2 3 drop 4 5 git rebase -i --root 
 + test E = $(git cat-file commit HEAD | sed -ne \$p) 
 + test C = $(git cat-file commit HEAD^ | sed -ne \$p) 
 + test A = $(git cat-file commit HEAD^^ | sed -ne \$p)
 +'
 +
  test_done
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit

2015-06-23 Thread Remi Galan Alfonso
Matthieu Moy matthieu@grenoble-inp.fr writes:
 Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr writes:
 
  Eric Sunshine sunsh...@sunshineco.com writes:
   +test_rebase_end () {
   +   test_when_finished git checkout master 
   +   git branch -D $1 
   +   test_might_fail git rebase --abort 
   +   git checkout -b $1 master
   +}
 
  The way this is indented makes it difficult to see that lines 2 and 3
  are continuations of 1. Perhaps format it like this instead?
 
  test_rebase_end () {
  test_when_finished git checkout master 
  git branch -D $1 
  test_might_fail git rebase --abort 
  git checkout -b $1 master
  }
 
  I completely agree with you, moreover it was indented like this before.
  I'll change it in my local version for now.
 
 Perhaps to avoid confusion, stg like:
 
 test_when_finished 
 ... 
 ...
  
 git checkout
 
 (the closing  alone on its line)

I think that the indentation on its own is enough to avoid confusion
 test_rebase_end () {
   test_when_finished git checkout master 
   git branch -D $1 
   test_might_fail git rebase --abort 
   git checkout -b $1 master
 }
but your idea is fine as well, so I'm ok with either way.

Thanks,
Rémi
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 3/3] git rebase -i: add static check for commands and SHA-1

2015-06-23 Thread Junio C Hamano
Galan Rémi  remi.galan-alfo...@ensimag.grenoble-inp.fr writes:

  I used:
read -r command sha1 rest EOF
$line
EOF
  because
printf '%s' $line | read -r command sha1 rest
  doesn't work (the 3 variables have no value as a result).
  There might be a better way to do this, but I don't have it right now.

while read line
do
(
IFS=' '
set x $line
shift
# now $1 is your command, $2 is sha1, $3 is remainder
...
)
done

perhaps?

But more importantly, why do you even need to keep the bad ones in a
separate .badcmd and .badsha files?  Isn't that bloating your changes
unnecessarily, iow, if you issued your warning as you encounter them,
wouldn't the change become cleaner and easier to understand (and as
a side effect it may even become smaller)?  The _only_ thing that
you would get by keeping them in temporary files is that you can do
one header and bunch of errors, but is it so common to make a bad
edit to the insn sheet that a sequence of errors, one per line
becomes more burdensome to the end user?

I would think

stripspace |
while read -r command sha1 rest
do
...

and showing the warning as you detect inside that loop would be
sufficient.  Perhaps I am missing subtle details of what you are
doing.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit

2015-06-23 Thread Matthieu Moy
Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr writes:

 Eric Sunshine sunsh...@sunshineco.com writes:
  +test_rebase_end () {
  +   test_when_finished git checkout master 
  +   git branch -D $1 
  +   test_might_fail git rebase --abort 
  +   git checkout -b $1 master
  +}
 
 The way this is indented makes it difficult to see that lines 2 and 3
 are continuations of 1. Perhaps format it like this instead?
 
 test_rebase_end () {
 test_when_finished git checkout master 
 git branch -D $1 
 test_might_fail git rebase --abort 
 git checkout -b $1 master
 }

 I completely agree with you, moreover it was indented like this before.
 I'll change it in my local version for now.

Perhaps to avoid confusion, stg like:

test_when_finished 
... 
...
 
git checkout

(the closing  alone on its line)

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 4/5] bisect: add the terms old/new

2015-06-23 Thread Remi Galan Alfonso
Matthieu Moy matthieu@imag.fr writes:
 Signed-off-by: Matthieu Moy matthieu@grenoble-inp.fr
 Signed-off-by: Matthieu Moy matthieu@imag.fr

Sounds like you went all out with this patch.

Rémi
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: Dependency Management

2015-06-23 Thread Josh Hagins
If neither git-submodule nor git-subtree is palatable to you, here are
a couple of alternatives you might try:

  * https://github.com/ingydotnet/git-subrepo
  * https://github.com/tdd/git-stree

On Tue, Jun 23, 2015 at 1:36 PM Stefan Beller sbel...@google.com wrote:

 On Tue, Jun 23, 2015 at 1:52 AM, Jean Audibert jaudib...@euronext.com wrote:
  Hi,
 
  Sorry to bother you with this question but I can't find any official 
  answer or strong opinion from Git community.
 
  In my company we recently started to use Git and we wonder how to share 
  code and manage dependencies with Git?
  Use case: in project P we need to include lib-a and lib-b (libraries shared 
  by several projects)
 
  In your opinion, what is the future proof solution?
  * Use submodule
  * Use subtree
 
  We know there is lot of PRO/CONS but I feel that subtree is behind in the 
  race and the latest version of submodule work fine

 Use whatever works fine for your use case.

 My personal opinion/expectation is to see submodules
 improving/advancing more than subtrees advancing in the near future.
 Though this is neither the official nor a strong opinion.

 Stefan

 
  Suggestions are very welcome.
  Thanks in advance,
 
  Jean Audibert
 
 
  _
 
  This message may contain confidential information and is intended for 
  specific recipients unless explicitly noted otherwise. If you have reason 
  to believe you are not an intended recipient of this message, please delete 
  it and notify the sender. This message may not represent the opinion of 
  Euronext N.V. or any of its subsidiaries or affiliates, and does not 
  constitute a contract or guarantee. Unencrypted electronic mail is not 
  secure and the recipient of this message is expected to provide safeguards 
  from viruses and pursue alternate means of communication where privacy or a 
  binding message is desired.
 
  --
  To unsubscribe from this list: send the line unsubscribe git in
  the body of a message to majord...@vger.kernel.org
  More majordomo info at  http://vger.kernel.org/majordomo-info.html
 --
 To unsubscribe from this list: send the line unsubscribe git in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit

2015-06-23 Thread Eric Sunshine
On Tue, Jun 23, 2015 at 3:01 PM, Remi Galan Alfonso
remi.galan-alfo...@ensimag.grenoble-inp.fr wrote:
 Eric Sunshine sunsh...@sunshineco.com writes:
  +test_rebase_end () {
  +   test_when_finished git checkout master 
  +   git branch -D $1 
  +   test_might_fail git rebase --abort 
  +   git checkout -b $1 master
  +}

 The way this is indented makes it difficult to see that lines 2 and 3
 are continuations of 1. Perhaps format it like this instead?

 test_rebase_end () {
 test_when_finished git checkout master 
 git branch -D $1 
 test_might_fail git rebase --abort 
 git checkout -b $1 master
 }

 I completely agree with you, moreover it was indented like this before.
 I'll change it in my local version for now.

 Ironically, it was modified after the following:

 Galan Rémi remi.galan-alfo...@ensimag.grenoble-inp.fr writes:
 Eric Sunshine sunsh...@sunshineco.com writes:
   +test_expect_success 'rebase -i respects 
   rebase.missingCommitsCheck=ignore' '
   +   test_config rebase.missingCommitsCheck ignore 
   +   test_when_finished git checkout master 
   +   git branch -D tmp2 
 
  Strange indentation.

Clearly, that guy who made the Strange indentation review comment
didn't know what he was talking about. ;-)

In that earlier review, I must have missed the fact that the quoted
string spanned multiple lines, and, unfortunately, got too busy with
other things to say ah, the indentation makes perfect sense when
your response pointed out its true nature.

Matthieu's suggestion of formatting it like:

test_when_finished 
... 
...
 

should help to avoid such misconceptions.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread David Turner
On Tue, 2015-06-23 at 17:51 +0200, Michael Haggerty wrote:
 On 06/23/2015 02:50 AM, David Turner wrote:
  I've revived and modified Ronnie Sahlberg's work on the refs db
  backend.  
  
  The work is on top of be3c13e5564, Junio's First batch for 2.5 cycle.
  I recognize that there have been changes to the refs code since then,
  and that there are some further changes in-flight from e.g. Michael
  Haggerty.  If there is interest in this, I can rebase once Michael's
  changes land.
 
 It's awesome that you are working on this!
 
 I'm reading through your commits and will add comments as they pop into
 my head...
 
 * I initially read refs-be-files to be a short version of references,
 they be files. I might never be able to get that pronunciation out of
 my head :-)

That's OK so long as I can keep pronouncing reflog as re-flog. ;)

 * It would be more modest to call the files implementing the LMDB
 backend refs-be-lmdb.[c,h] rather than refs-be-db.[c,h].

Agreed.  Will fix.

 * I wonder whether `refname_is_safe()` might eventually have to become
 backend-specific. For example, maybe one backend will have to impose a
 limit of 128 characters or something? No matter, though...it can be
 moved later.

I think it would be an error to allow backends to impose additional
limits on ref names.  The limits imposed by the current backend have
been the cause of much sadness here at Twitter (primarily,
case-conflicts combined with d/f conflicts).

 * You have put `format_reflog_msg()` in the public interface. It
 probably makes sense, because more than one backend might want to use
 it. But another backend might want to store (refname, old_sha1,
 new_sha1, ...) as separate columns in a database. As long as
 `format_reflog_msg()` is seen as a helper function and is not called by
 any of the shared code, it shouldn't be a problem.

Agreed.

 * I wonder whether `init_backend()` will be general enough. 

We can always upgrade it later.

 * Your methods for bulk updates are I think analogous to the
 `initial_ref_transaction_commit()` function that I recently submitted
 [1]. Either way, the goal is to abstract away the fact that the
 file-based backend uses packed and loose references with tradeoffs that
 callers currently have to know about.

Yes, I saw your work after I had already started mine.

 * I don't like the fact that you have replaced `struct ref_transaction
 *` with `void *` in the public interface. On a practical level, I like
 the bit of type-safety that comes with the more specific declaration.
 But on a more abstract level, I think that the concept of a transaction
 could be useful across backends, for example in utility functions that
 verify that a proposed set of updates are internally consistent. I would
 rather see either
 
   * backends extend a basic `struct ref_transaction` to suit their
 needs, and upcast/downcast pointers at the module boundary, or
 
   * `struct ref_transaction` itself gets a `void *` member that backends
 can use for whatever purposes they want.

There are no common fields between refs-be-file transactions and
refs-be-lmdb transactions.  I don't see much gain from adding an empty
ref_transaction that backends could extend, since we would have to
explicitly upcast/downcast all over the place.

 * Regarding MERGE_HEAD: you take the point of view that it must continue
 to be stored as a file. And yet it must also behave somewhat like a
 reference; for example, `git rev-parse MERGE_HEAD` works today.
 MERGE_HEAD is also used for reachability, right?
 
 Another point of view is that MERGE_HEAD is a plain old boring
 reference, but there is some other metadata related to it that the refs
 backend has to store. The file-based backend would have special-case
 code to read the additional data from the tail of the loose refs file
 (and be sure to write the metadata when writing the reference), but
 other backends could store the reference with the rest but do their own
 thing with the metadata. So I guess I'm wondering whether the refs API
 needs a MERGE_HEAD-specific way to read and write MERGE_HEAD along with
 its metadata.

You are probably right that this is a good idea.

 * Don't the same considerations that apply to MERGE_HEAD also apply to
 FETCH_HEAD?

All of the tests pass without any special handling of FETCH_HEAD.

 * I'm showing my ignorance of LMDB, but I don't see where the LMDB
 backend initializes its database during `git init-db`. Is that what
 `init_env()` does? But it looks like `init_env()` is called on every git
 invocation (via `git_config_early()`). Puzzled.

There is no need to explicitly create the database (other than with
mkdir); init_env does everything for you.

 * Meanwhile, `create_default_files()` (in `builtin/init-db`) still
 creates directories `refs`, `refs/heads`, and `refs/tags`.

Yeah, that's legit.  I'll create a backend initdb function, and rename
init to setup.

 * Rehash of the last two points: I expected one backend function that is
 used to 

Re: [PATCH v7 5/5] bisect: allows any terms set by user

2015-06-23 Thread Junio C Hamano
Matthieu Moy matthieu@imag.fr writes:

Subject: Re: [PATCH v7 5/5] bisect: allows any terms set by user

s/allows/allow/;

 +Alternative terms: use your own terms
 +

Lengths of underline and the text must match.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git status --diffstat?

2015-06-23 Thread Jacek Wielemborek
Hi,

I recently realized that I could use a git status syntax like this:

On branch master
Your branch is up-to-date with 'origin/master'.
Changes not staged for commit:
  (use git add file... to update what will be committed)
  (use git checkout -- file... to discard changes in working directory)

modified:   macd.py 2 +-
modified:   macd.wsgi 2 +-
modified:   macd/admin.py 4 
modified:   macd/index.html 2 +-
modified:   macd/models.py 7 +--
modified:   macd/settings.py27 +--
modified:   macd/urls.py 6 ++
modified:   macd/views.py27 +++

(...)

The idea is to add diffstats to git-status. I cooked up a Python script
to do that [1], but I'd like that to be default behavior on my box.
Someone suggested me to just implement it in C, but I'm not familiar
with the codebase, so it'd take me a while.

What do you think about this? Would anybody else find it useful and
perhaps consider implementing it?

Cheers,
d33tah



signature.asc
Description: OpenPGP digital signature


Re: [PATCH v7 0/5] git bisect old/new

2015-06-23 Thread Junio C Hamano
Matthieu Moy matthieu@imag.fr writes:

 I fixed a few minor issues in v6. Patch between my version and v6 is
 below. I also pushed my branch here:

   https://github.com/moy/git/tree/bisect-terms

It is somewhat confusing to see v3 yesterday and then this v7 next
day.  How did I miss v4 thru v6?

 Not visible in the patch below: I squashed PATCH 5 into PATCH 3 to
 avoid the pattern break stuff and then repair it.

Good.

 The first two patches seem ready.

Yeah, the first one is obviously fine ;-), and I agree the second
one looks more or less OK.

Regarding the second and third one, the messages they give when the
user marked one tip of a side branch as old and the other new gave
me a hiccup while reading them, though.

if (!strcmp(name_bad, bad)) {
fprintf(stderr, The merge base %s is bad.\n
This means the bug has been fixed 
between %s and [%s].\n,
bad_hex, bad_hex, good_hex);
} else {
fprintf(stderr, The merge base %s is %s.\n
This means the first commit marked %s is 
between %s and [%s].\n,
bad_hex, name_bad, name_bad, bad_hex, good_hex);

The bad side is inherited from the original and not your fault,
but it was already very hard to understand. The other side is not
just unreadable, but I think is incorrect and confusing to say
first commit marked %(name_bad)s; you know there are history
segments whose oldest ends (i.e. merge base that is bad) are marked
as 'bad', and the other ends are marked as 'good', and you haven't
marked any of the commits in between yet.  So there is no first
commit marked either as bad or good there between these endpoints
(yet).

Also I was somewhat puzzled and disappointed to still see
name_{bad,good} not name_{new,old} used as variable names even in
the endgame patch, though.  Is that intended?

 PATCH 4 (add old/new) is still buggy. When starting a bisection with

   git bisect start $old $new

 the command git bisect visualize does not work (it shows no commit).

 I consider PATCH 5 as WIP, I think it would need a lot of polishing
 and testing to be mergeable. I think a reasonable objective for now it
 to get old/new working in the user-interface, and drop PATCH 5 from
 the series when it gets merged. The existance of PATCH 5 is a very
 good thing even if it doesn't get merged:

 * The fact that it's possible to do it on top of the series shows that
   we make the code more generic. I think it's important that
   regardless of features, the code moves in the right direction.

 * The patch can be taken over later by someone else.

Yeah, if I may rephrase to make sure we are on the same page, in
order for 5/5 to be done sanely, 1-4/5 must be giving a good
foundation to build on.  I agree with that, I agree that including a
polished 5/5 would be a good thing, and then I further agree that
1-4/5 could be delivered before 5/5 is ready.

Thanks.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: RFC/Pull Request: Refs db backend

2015-06-23 Thread Randall S. Becker
 -Original Message-
 From: git-ow...@vger.kernel.org [mailto:git-ow...@vger.kernel.org] On
 Behalf Of David Turner
 Sent: June 23, 2015 4:22 PM
 To: Randall S. Becker
 Cc: 'Stefan Beller'; 'git mailing list'; 'ronnie sahlberg'
 Subject: Re: RFC/Pull Request: Refs db backend
 
  Just to beg a request: LMDB is not available on some MPP architectures to
 which git has been ported. If it comes up, I beg you not to add this as a
 dependency to base git components.
 
 My changes make `configure` check for the presence of liblmdb. The LMDB
 code is only built if liblmdb is present.  So, I think we're good.

Thanks :) You have no idea how much, in a burnt by that in other projects POV.

Cheers,
Randall

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit

2015-06-23 Thread Junio C Hamano
Matthieu Moy matthieu@grenoble-inp.fr writes:

 Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr writes:

 I think that the indentation on its own is enough to avoid confusion
 test_rebase_end () {
 test_when_finished git checkout master 
 git branch -D $1 
 test_might_fail git rebase --abort 
 git checkout -b $1 master
 }
 but your idea is fine as well, so I'm ok with either way.

 Read too quickly, it looks like a mis-indentation (I could laugh at Eric
 here, but I made the same confusion when reading the code at first). By
 avoid the confusion I mean make it clear it's not a mis-indentation.

Yes, that stray  fooled me as well.  If it were following your
suggestion in the earlier message on this thread, i.e.

test_when_finished 
... 
...
 
git checkout

I wouldn't have to waste time commenting on it ;-)
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit

2015-06-23 Thread Remi Galan Alfonso
Junio C Hamano gits...@pobox.com writes:
 Matthieu Moy matthieu@grenoble-inp.fr writes:
 
  Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr writes:
 
  I think that the indentation on its own is enough to avoid confusion
  test_rebase_end () {
  test_when_finished git checkout master 
  git branch -D $1 
  test_might_fail git rebase --abort 
  git checkout -b $1 master
  }
  but your idea is fine as well, so I'm ok with either way.
 
  Read too quickly, it looks like a mis-indentation (I could laugh at Eric
  here, but I made the same confusion when reading the code at first). By
  avoid the confusion I mean make it clear it's not a mis-indentation.
 
 Yes, that stray  fooled me as well.  If it were following your
 suggestion in the earlier message on this thread, i.e.
 
 test_when_finished 
 ... 
 ...
  
 git checkout
 
 I wouldn't have to waste time commenting on it ;-)

I will do it this way then. ;)

Thanks,
Rémi
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread Junio C Hamano
David Turner dtur...@twopensource.com writes:

 On Tue, 2015-06-23 at 15:53 -0400, David Turner wrote:
  * Regarding MERGE_HEAD: you take the point of view that it must continue
  to be stored as a file. And yet it must also behave somewhat like a
  reference; for example, `git rev-parse MERGE_HEAD` works today.
  MERGE_HEAD is also used for reachability, right?
  
  Another point of view is that MERGE_HEAD is a plain old boring
  reference, but there is some other metadata related to it that the refs
  backend has to store. The file-based backend would have special-case
  code to read the additional data from the tail of the loose refs file
  (and be sure to write the metadata when writing the reference), but
  other backends could store the reference with the rest but do their own
  thing with the metadata. So I guess I'm wondering whether the refs API
  needs a MERGE_HEAD-specific way to read and write MERGE_HEAD along with
  its metadata.
 
 You are probably right that this is a good idea.

 On reflection, I think it might make sense to keep MERGE_HEAD as a file.
 The problem is that not only would refs backends have to add new
 MERGE_HEAD-handling functions, but we would also need new plumbing
 commands to allow scripts to access the complete contents of MERGE_HEAD.
 That seems more complicated to me.  

I think you are talking about FETCH_HEAD, but I tend to agree.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread David Turner
On Tue, 2015-06-23 at 10:16 -0700, Stefan Beller wrote:
  The db backend code was added in the penultimate commit; the rest is
  just code rearrangement and minor changes to make alternate backends
  possible.  There ended up being a fair amount of this 
  rearrangement,  but the end result is that almost the entire git 
  test suite runs under the db backend without error (see below for 
 details).
 
 Looking at the end result in refs-be-db.c it feels like there are more
 functions in the refs_be_db struct, did this originate from other 
 design choices? IIRC Ronnie wanted to have as least functions in 
 there as possible, and share as much of the code between the 
 databases, such that the glue between the db and the refs code is 
 minimal.

I didn't actually spend that much time reading Ronnie's backend code.
My code aims to be extremely thoroughly compatible.  I spent a ton of
time making sure that the git test suite passed.  I don't know if an
alternate approach would have been as compatible.

The requirement for reflog storage did complicate things a bit.

I also didn't see a strong need to abstract the database, since LMDB is
common, widely compatible, and tiny.  

 Some random comments from looking over the branch briefly:
 
 In the latest commit, (refs: tests for db backend), I am unsure about 
 the copyright annotations. At least a sole Copyright (c) 2007 Junio C
 Hamano doesn't make sense to me. ;)

Will fix, thanks.

 Typo in commit message bisect: use refs insfrastructure for 
 BISECT_START

Will fix, thanks.

 Some commits contain a ChangeId, which is a Gerrit leftover. :(

Those were leftover from Ronnie's patches; since you are a Googler and
you think we don't need them, I'll remove them. 


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


RE: RFC/Pull Request: Refs db backend

2015-06-23 Thread Randall S. Becker
 -Original Message-
 From: git-ow...@vger.kernel.org [mailto:git-ow...@vger.kernel.org] On
 Behalf Of David Turner
 Sent: June 23, 2015 4:05 PM
 To: Stefan Beller
 Cc: git mailing list; ronnie sahlberg
 Subject: Re: RFC/Pull Request: Refs db backend
 
 On Tue, 2015-06-23 at 10:16 -0700, Stefan Beller wrote:
   The db backend code was added in the penultimate commit; the rest is
   just code rearrangement and minor changes to make alternate backends
   possible.  There ended up being a fair amount of this rearrangement,
   but the end result is that almost the entire git test suite runs
   under the db backend without error (see below for
  details).
 
  Looking at the end result in refs-be-db.c it feels like there are more
  functions in the refs_be_db struct, did this originate from other
  design choices? IIRC Ronnie wanted to have as least functions in there
  as possible, and share as much of the code between the databases, such
  that the glue between the db and the refs code is minimal.
 
 I didn't actually spend that much time reading Ronnie's backend code.
 My code aims to be extremely thoroughly compatible.  I spent a ton of time
 making sure that the git test suite passed.  I don't know if an alternate
 approach would have been as compatible.
 
 The requirement for reflog storage did complicate things a bit.
 
 I also didn't see a strong need to abstract the database, since LMDB is 
 common,
 widely compatible, and tiny.

Just to beg a request: LMDB is not available on some MPP architectures to which 
git has been ported. If it comes up, I beg you not to add this as a dependency 
to base git components.

Thanks,
Randall

-- Brief whoami: NonStopUNIX developer since approximately 
UNIX(421664400)/NonStop(2112884442)
-- In my real life, I talk too much.



--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 01/10] t9001-send-email: move script creation in a setup test

2015-06-23 Thread Remi Lespinet
Move the creation of the scripts used in to-cmd and cc-cmd tests
in a setup test to make them available for later tests.

Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr
---
 t/t9001-send-email.sh | 15 +--
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index a3663da..eef12e6 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -312,13 +312,19 @@ test_expect_success $PREREQ,!AUTOIDENT 'broken implicit 
ident aborts send-email'
)
 '
 
+test_expect_success $PREREQ 'setup tocmd and cccmd scripts' '
+   write_script tocmd-sed -\EOF 
+   sed -n -e s/^tocmd--//p $1
+   EOF
+   write_script cccmd-sed -\EOF
+   sed -n -e s/^cccmd--//p $1
+   EOF
+'
+
 test_expect_success $PREREQ 'tocmd works' '
clean_fake_sendmail 
cp $patches tocmd.patch 
echo tocmd--to...@example.com tocmd.patch 
-   write_script tocmd-sed -\EOF 
-   sed -n -e s/^tocmd--//p $1
-   EOF
git send-email \
--from=Example nob...@example.com \
--to-cmd=./tocmd-sed \
@@ -332,9 +338,6 @@ test_expect_success $PREREQ 'cccmd works' '
clean_fake_sendmail 
cp $patches cccmd.patch 
echo cccmd--  cc...@example.com cccmd.patch 
-   write_script cccmd-sed -\EOF 
-   sed -n -e s/^cccmd--//p $1
-   EOF
git send-email \
--from=Example nob...@example.com \
--to=nob...@example.com \
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 06/10] send-email: minor code refactoring

2015-06-23 Thread Remi Lespinet
Group expressions in a single if statement. This avoid checking
multiple time if the variable $sender is defined.

Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr
---
 git-send-email.perl | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index f61449d..a0cd7ff 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -799,9 +799,9 @@ if (!$force) {
}
 }
 
-($sender) = expand_aliases($sender) if defined $sender;
-
-if (!defined $sender) {
+if (defined $sender) {
+   ($sender) = expand_aliases($sender);
+} else {
$sender = $repoauthor || $repocommitter || '';
 }
 
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 04/10] send-email: refactor address list process

2015-06-23 Thread Remi Lespinet
Simplify code by creating a function which transform a list of strings
containing email addresses (separated by commas, comporting aliases)
into a clean list of valid email addresses.

Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr
---
 git-send-email.perl | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 8bf38ee..2d5c530 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -833,12 +833,9 @@ sub expand_one_alias {
return $aliases{$alias} ? expand_aliases(@{$aliases{$alias}}) : $alias;
 }
 
-@initial_to = expand_aliases(@initial_to);
-@initial_to = validate_address_list(sanitize_address_list(@initial_to));
-@initial_cc = expand_aliases(@initial_cc);
-@initial_cc = validate_address_list(sanitize_address_list(@initial_cc));
-@bcclist = expand_aliases(@bcclist);
-@bcclist = validate_address_list(sanitize_address_list(@bcclist));
+@initial_to = process_address_list(@initial_to);
+@initial_cc = process_address_list(@initial_cc);
+@bcclist = process_address_list(@bcclist);
 
 if ($thread  !defined $initial_reply_to  $prompting) {
$initial_reply_to = ask(
@@ -1051,6 +1048,13 @@ sub sanitize_address_list {
return (map { sanitize_address($_) } @_);
 }
 
+sub process_address_list {
+   my @addr_list = expand_aliases(@_);
+   @addr_list = sanitize_address_list(@addr_list);
+   @addr_list = validate_address_list(@addr_list);
+   return @addr_list;
+}
+
 # Returns the local Fully Qualified Domain Name (FQDN) if available.
 #
 # Tightly configured MTAa require that a caller sends a real DNS
@@ -1560,10 +1564,8 @@ foreach my $t (@files) {
($confirm =~ /^(?:auto|compose)$/  $compose  $message_num 
== 1));
$needs_confirm = inform if ($needs_confirm  $confirm_unconfigured 
 @cc);
 
-   @to = expand_aliases(@to);
-   @to = validate_address_list(sanitize_address_list(@to));
-   @cc = expand_aliases(@cc);
-   @cc = validate_address_list(sanitize_address_list(@cc));
+   @to = process_address_list(@to);
+   @cc = process_address_list(@cc);
 
@to = (@initial_to, @to);
@cc = (@initial_cc, @cc);
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 07/10] send-email: reduce dependencies impact on parse_address_line

2015-06-23 Thread Remi Lespinet
parse_address_line had not the same behavior whether the user had
Mail::Address or not. Teach parse_address_line to behave like
Mail::Address.

When the user input is correct, this implementation behaves
exactly like Mail::Address except when there are quotes
inside the name:

  Jane Doe j...@example.com

In this case the result of parse_address_line is:

  With M::A : Jane Do e j...@example.com
  Without   : Jane Do e j...@example.com

When the user input is not correct, the behavior is also mostly
the same.

Unlike Mail::Address, this doesn't parse groups and recursive
commentaries.

Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr
---
 git-send-email.perl  |  2 +-
 perl/Git.pm  | 67 
 t/t9000-addresses.sh | 30 +++
 t/t9000/test.pl  | 67 
 4 files changed, 165 insertions(+), 1 deletion(-)
 create mode 100755 t/t9000-addresses.sh
 create mode 100755 t/t9000/test.pl

diff --git a/git-send-email.perl b/git-send-email.perl
index a0cd7ff..bced78e 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -478,7 +478,7 @@ sub parse_address_line {
if ($have_mail_address) {
return map { $_-format } Mail::Address-parse($_[0]);
} else {
-   return split_addrs($_[0]);
+   return Git::parse_mailboxes($_[0]);
}
 }
 
diff --git a/perl/Git.pm b/perl/Git.pm
index 9026a7b..19ef081 100644
--- a/perl/Git.pm
+++ b/perl/Git.pm
@@ -864,6 +864,73 @@ sub ident_person {
return $ident[0] $ident[1];
 }
 
+=item parse_mailboxes
+
+Return an array of mailboxes extracted from a string.
+
+=cut
+
+sub parse_mailboxes {
+   my $re_comment = qr/\((?:[^)]*)\)/;
+   my $re_quote = qr/(?:[^\\\]|\\.)*/;
+   my $re_word = qr/(?:[^][\s():;@\\,.]|\\.)+/;
+
+   # divide the string in tokens of the above form
+   my $re_token = qr/(?:$re_quote|$re_word|$re_comment|\S)/;
+   my @tokens = map { $_ =~ /\s*($re_token)\s*/g } @_;
+
+   # add a delimiter to simplify treatment for the last mailbox
+   push @tokens, ,;
+
+   my (@addr_list, @phrase, @address, @comment, @buffer) = ();
+   foreach my $token (@tokens) {
+   if ($token =~ /^[,;]$/) {
+   # if buffer still contains undeterminated strings
+   # append it at the end of @address or @phrase
+   if (@address) {
+   push @address, @buffer;
+   } else {
+   push @phrase, @buffer;
+   }
+
+   my $str_phrase = join ' ', @phrase;
+   my $str_address = join '', @address;
+   my $str_comment = join ' ', @comment;
+
+   # quote are necessary if phrase contains
+   # special characters
+   if ($str_phrase =~ /[][():;@\\,.\000-\037\177]/) {
+   $str_phrase =~ s/(^|[^\\])/$1/g;
+   $str_phrase = qq[$str_phrase];
+   }
+
+   # add  around the address if necessary
+   if ($str_address ne   $str_phrase ne ) {
+   $str_address = qq[$str_address];
+   }
+
+   my $str_mailbox = $str_phrase $str_address 
$str_comment;
+   $str_mailbox =~ s/^\s*|\s*$//g;
+   push @addr_list, $str_mailbox if ($str_mailbox);
+
+   @phrase = @address = @comment = @buffer = ();
+   } elsif ($token =~ /^\(/) {
+   push @comment, $token;
+   } elsif ($token eq ) {
+   push @phrase, (splice @address), (splice @buffer);
+   } elsif ($token eq ) {
+   push @address, (splice @buffer);
+   } elsif ($token eq @) {
+   push @address, (splice @buffer), @;
+   } elsif ($token eq .) {
+   push @address, (splice @buffer), .;
+   } else {
+   push @buffer, $token;
+   }
+   }
+
+   return @addr_list;
+}
 
 =item hash_object ( TYPE, FILENAME )
 
diff --git a/t/t9000-addresses.sh b/t/t9000-addresses.sh
new file mode 100755
index 000..7223d03
--- /dev/null
+++ b/t/t9000-addresses.sh
@@ -0,0 +1,30 @@
+#!/bin/sh
+#
+# Copyright (c) 2015
+#
+
+test_description='compare address parsing with and without Mail::Address'
+. ./test-lib.sh
+
+if ! test_have_prereq PERL; then
+   skip_all='skipping perl interface tests, perl not available'
+   test_done
+fi
+
+perl -MTest::More -e 0 2/dev/null || {
+   skip_all=Perl Test::More unavailable, skipping test
+   test_done
+}
+
+perl -MMail::Address -e 0 2/dev/null || {
+   skip_all=Perl Mail::Address 

[PATCH v6 02/10] send-email: allow aliases in patch header and command script outputs

2015-06-23 Thread Remi Lespinet
Interpret aliases in:

  -  Header fields of patches generated by git format-patch
 (using --to, --cc, --add-header for example) or
 manually modified. Example of fields in header:

  To: alias1
  Cc: alias2
  Cc: alias3

  -  Outputs of command scripts specified by --cc-cmd and
 --to-cmd. Example of script:

  #!/bin/sh
  echo alias1
  echo alias2

Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr
---
 git-send-email.perl   |  2 ++
 t/t9001-send-email.sh | 60 +++
 2 files changed, 62 insertions(+)

diff --git a/git-send-email.perl b/git-send-email.perl
index 6bedf74..8bf38ee 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1560,7 +1560,9 @@ foreach my $t (@files) {
($confirm =~ /^(?:auto|compose)$/  $compose  $message_num 
== 1));
$needs_confirm = inform if ($needs_confirm  $confirm_unconfigured 
 @cc);
 
+   @to = expand_aliases(@to);
@to = validate_address_list(sanitize_address_list(@to));
+   @cc = expand_aliases(@cc);
@cc = validate_address_list(sanitize_address_list(@cc));
 
@to = (@initial_to, @to);
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index eef12e6..f7d4132 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -1579,6 +1579,66 @@ test_expect_success $PREREQ 
'sendemail.aliasfiletype=sendmail' '
grep ^!o@example\.com!$ commandline1
 '
 
+test_expect_success $PREREQ 'alias support in To header' '
+   clean_fake_sendmail 
+   echo alias sbd  some...@example.org .mailrc 
+   test_config sendemail.aliasesfile .mailrc 
+   test_config sendemail.aliasfiletype mailrc 
+   git format-patch --stdout -1 --to=sbd aliased.patch 
+   git send-email \
+   --from=Example nob...@example.com \
+   --smtp-server=$(pwd)/fake.sendmail \
+   aliased.patch \
+   2errors out 
+   grep ^!someone@example\.org!$ commandline1
+'
+
+test_expect_success $PREREQ 'alias support in Cc header' '
+   clean_fake_sendmail 
+   echo alias sbd  some...@example.org .mailrc 
+   test_config sendemail.aliasesfile .mailrc 
+   test_config sendemail.aliasfiletype mailrc 
+   git format-patch --stdout -1 --cc=sbd aliased.patch 
+   git send-email \
+   --from=Example nob...@example.com \
+   --smtp-server=$(pwd)/fake.sendmail \
+   aliased.patch \
+   2errors out 
+   grep ^!someone@example\.org!$ commandline1
+'
+
+test_expect_success $PREREQ 'tocmd works with aliases' '
+   clean_fake_sendmail 
+   echo alias sbd  some...@example.org .mailrc 
+   test_config sendemail.aliasesfile .mailrc 
+   test_config sendemail.aliasfiletype mailrc 
+   git format-patch --stdout -1 tocmd.patch 
+   echo tocmd--sbd tocmd.patch 
+   git send-email \
+   --from=Example nob...@example.com \
+   --to-cmd=./tocmd-sed \
+   --smtp-server=$(pwd)/fake.sendmail \
+   tocmd.patch \
+   2errors out 
+   grep ^!someone@example\.org!$ commandline1
+'
+
+test_expect_success $PREREQ 'cccmd works with aliases' '
+   clean_fake_sendmail 
+   echo alias sbd  some...@example.org .mailrc 
+   test_config sendemail.aliasesfile .mailrc 
+   test_config sendemail.aliasfiletype mailrc 
+   git format-patch --stdout -1 cccmd.patch 
+   echo cccmd--sbd cccmd.patch 
+   git send-email \
+   --from=Example nob...@example.com \
+   --cc-cmd=./cccmd-sed \
+   --smtp-server=$(pwd)/fake.sendmail \
+   cccmd.patch \
+   2errors out 
+   grep ^!someone@example\.org!$ commandline1
+'
+
 do_xmailer_test () {
expected=$1 params=$2 
git format-patch -1 
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 03/10] t9001-send-email: refactor header variable fields replacement

2015-06-23 Thread Remi Lespinet
Create a function which replaces Date, Message-Id and
X-Mailer lines generated by git-send-email by a specific string:

Date:.*$   - Date: DATE-STRING
Message-Id:.*$ - Message-Id: MESSAGE-ID-STRING
X-Mailer:.*$   - X-Mailer: X-MAILER-STRING
Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr
---
 t/t9001-send-email.sh | 11 +++
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index f7d4132..714fcae 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -522,6 +522,12 @@ Result: OK
 EOF
 
 
+replace_variable_fields () {
+   sed -e s/^\(Date:\).*/\1 DATE-STRING/ \
+   -e s/^\(Message-Id:\).*/\1 MESSAGE-ID-STRING/ \
+   -e s/^\(X-Mailer:\).*/\1 X-MAILER-STRING/
+}
+
 test_suppression () {
git send-email \
--dry-run \
@@ -529,10 +535,7 @@ test_suppression () {
--from=Example f...@example.com \
--to=t...@example.com \
--smtp-server relay.example.com \
-   $patches |
-   sed -e s/^\(Date:\).*/\1 DATE-STRING/ \
-   -e s/^\(Message-Id:\).*/\1 MESSAGE-ID-STRING/ \
-   -e s/^\(X-Mailer:\).*/\1 X-MAILER-STRING/ \
+   $patches | replace_variable_fields \
actual-suppress-$1${2+-$2} 
test_cmp expected-suppress-$1${2+-$2} actual-suppress-$1${2+-$2}
 }
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 05/10] send-email: Allow use of aliases in the From field of --compose mode

2015-06-23 Thread Remi Lespinet
Aliases were expanded before considering the From field of the
--compose option. This is inconsistent with other fields
(To, Cc, ...) which already support aliases.

Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr
---
 git-send-email.perl | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index 2d5c530..f61449d 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -555,8 +555,6 @@ if (@alias_files and $aliasfiletype and defined 
$parse_alias{$aliasfiletype}) {
}
 }
 
-($sender) = expand_aliases($sender) if defined $sender;
-
 # is_format_patch_arg($f) returns 0 if $f names a patch, or 1 if
 # $f is a revision list specification to be passed to format-patch.
 sub is_format_patch_arg {
@@ -801,6 +799,8 @@ if (!$force) {
}
 }
 
+($sender) = expand_aliases($sender) if defined $sender;
+
 if (!defined $sender) {
$sender = $repoauthor || $repocommitter || '';
 }
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 07/10] send-email: reduce dependencies impact on parse_address_line

2015-06-23 Thread Matthieu Moy
Your git send-email does not seem to like PATCHes 08-10/10 ;-).

Up to PATCH 07, the series looks good.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 08/10] send-email: consider quote as delimiter instead of character

2015-06-23 Thread Remi Lespinet
Do not consider quote inside a recipient name as character when
they are not escaped. This interprets:

  Jane Doe j...@example.com

as:

  Jane Doe j...@example.com

instead of:

  Jane\ \Doe j...@example.com

Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr
---
 git-send-email.perl | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/git-send-email.perl b/git-send-email.perl
index bced78e..a03392c 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -1028,15 +1028,17 @@ sub sanitize_address {
return $recipient;
}
 
+   # remove non-escaped quotes
+   $recipient_name =~ s/(^|[^\\])/$1/g;
+
# rfc2047 is needed if a non-ascii char is included
if ($recipient_name =~ /[^[:ascii:]]/) {
-   $recipient_name =~ s/^(.*)$/$1/;
$recipient_name = quote_rfc2047($recipient_name);
}
 
# double quotes are needed if specials or CTLs are included
elsif ($recipient_name =~ /[][()@,;:\\.\000-\037\177]/) {
-   $recipient_name =~ s/([\\\r])/\\$1/g;
+   $recipient_name =~ s/([\\\r])/\\$1/g;
$recipient_name = qq[$recipient_name];
}
 
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: What's cooking in git.git (Jun 2015, #05; Mon, 22)

2015-06-23 Thread Junio C Hamano
Johannes Schindelin johannes.schinde...@gmx.de writes:

 Hi Junio,

 On 2015-06-23 00:49, Junio C Hamano wrote:

 * js/rebase-i-clean-up-upon-continue-to-skip (2015-06-18) 3 commits
  - rebase -i: do not leave a CHERRY_PICK_HEAD file behind
  - SQUASH: test_must_fail is a shell function
  - t3404: demonstrate CHERRY_PICK_HEAD bug
 
  Abandoning an already applied change in git rebase -i with
  --continue left CHERRY_PICK_HEAD and confused later steps.
 
  Expecting a reroll.
  ($gmane/271856)

 Actually, there had been two re-rolls, and v3 seemed to be okay:
 $gmane/272037. It also looks as if 726a35ebd^..726a35ebd^2 is
 identical with v3... Anything you want me to change in addition?

Thanks for a pointer; I think I updated the topic and then forgot to
update the reference in whats cooking.  I'll take a look at 272037
and if I have anything further will comment there.

 Also:

 * js/fsck-opt (2015-06-22) 19 commits
  - fsck: support ignoring objects in `git fsck` via fsck.skiplist
  - fsck: git receive-pack: support excluding objects from fsck'ing
  - fsck: introduce `git fsck --connectivity-only`
  - fsck: support demoting errors to warnings
  - fsck: document the new receive.fsck.msg-id options
  - fsck: allow upgrading fsck warnings to errors
  - fsck: optionally ignore specific fsck issues completely
  - fsck: disallow demoting grave fsck errors to warnings
  - fsck: add a simple test for receive.fsck.msg-id
  - fsck: make fsck_tag() warn-friendly
  - fsck: handle multiple authors in commits specially
  - fsck: make fsck_commit() warn-friendly
  - fsck: make fsck_ident() warn-friendly
  - fsck: report the ID of the error/warning
  - fsck (receive-pack): allow demoting errors to warnings
  - fsck: offer a function to demote fsck errors to warnings
  - fsck: provide a function to parse fsck message IDs
  - fsck: introduce identifiers for fsck messages
  - fsck: introduce fsck options
 
  Rerolled (at v7) and seems more or less ready for 'next'.

 I see that you used `downcased` instead of my `lowercased`, which
 makes more sense, but the style of the multi-line `for` loop as per
 pu` is still as *I* wrote it... I also saw that you downcased the
 first letter after `fsck:` in the commit messages, and touched up the
 message of the report the ID of the error/warning commit. Do you
 want to touch up the `for` loop style in offer a function to demote
 fsck errors to warnings or shall I send a v8 (it is ready to go:
 https://github.com/dscho/git/compare/next...fsck-api)?

I often micro-tweak obvious things as I go over the series and
applying them one by one, but the for-layout was a kind of change
that I usually do not tweak during application (as there are larger
chances of causing unneeded conflicts with later patches, and at the
point of applying an earlier patch, I may not remember what I
learned by skimming later patches in the series) and left there.

If the for-layout is the only thing that is questionable thing to
fix in what I queued, I think I can locally fix-up without an extra
roundtrip.  

Thanks.




--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 0/5] git bisect old/new

2015-06-23 Thread Matthieu Moy
Junio C Hamano gits...@pobox.com writes:

 Matthieu Moy matthieu@imag.fr writes:

 I fixed a few minor issues in v6. Patch between my version and v6 is
 below. I also pushed my branch here:

   https://github.com/moy/git/tree/bisect-terms

 It is somewhat confusing to see v3 yesterday and then this v7 next
 day.  How did I miss v4 thru v6?

Oops, I pattern-matched the wrong part when reading [PATCH v3 6/6].
Indeed, this should have been numberred v4, I should have s/v6/v3/ in my
sentence above.

 Regarding the second and third one, the messages they give when the
 user marked one tip of a side branch as old and the other new gave
 me a hiccup while reading them, though.

   if (!strcmp(name_bad, bad)) {
   fprintf(stderr, The merge base %s is bad.\n
   This means the bug has been fixed 
   between %s and [%s].\n,
   bad_hex, bad_hex, good_hex);
   } else {
   fprintf(stderr, The merge base %s is %s.\n
   This means the first commit marked %s is 
   between %s and [%s].\n,
   bad_hex, name_bad, name_bad, bad_hex, good_hex);

 The bad side is inherited from the original and not your fault,
 but it was already very hard to understand. The other side is not
 just unreadable, but I think is incorrect and confusing to say
 first commit marked %(name_bad)s; you know there are history
 segments whose oldest ends (i.e. merge base that is bad) are marked
 as 'bad', and the other ends are marked as 'good', and you haven't
 marked any of the commits in between yet.  So there is no first
 commit marked either as bad or good there between these endpoints
 (yet).

The situation is (the bisection started with bad=branch1 and
good=branch2):

 base (name_bad) -- branch1 (name_bad)
\
 `- branch2 (name_good)

So, the first commit having property name_good is between base and
branch2. So, the message seems wrong in two ways:

* As you say, the wording marked as seem to imply that we did mark the
  commit, while we actually did not explore base..branch2
  I'd write the first '%s' commit (the important is to remove
  marked).

* The message uses 'name_bad' where it should use 'name_good'. Indeed,
  the original message talks about bug fixed, i.e. the first _good_
  commit is in base..branch2.

Is this what you meant?

(Sorry, I need drawings and bullet lists to think properly ;-) ).

Actually, I think it would make sense to include my drawing above in a
comment in the code.

 Also I was somewhat puzzled and disappointed to still see
 name_{bad,good} not name_{new,old} used as variable names even in
 the endgame patch, though.  Is that intended?

I still think that name_old/name_new would have been much better names
if we were to write bisect from scratch, but I got convinced by
Christian that name_bad/name_good made sense too. Even if we adopted
old/new in these variables, we would still have e.g. a variable
bad_seen in the code. So, moving the codebase to adopt the old/new
convention internally is more than just chosing the name of variables to
avoid hardcoded good/bad string litterals. Moving the brain of
developpers to adopt the old/new convention is even another debate.

I don't know if this is the reason why Antoine did not change it, but
that's why I did not insist further and did not do the change myself.

 Yeah, if I may rephrase to make sure we are on the same page, in
 order for 5/5 to be done sanely, 1-4/5 must be giving a good
 foundation to build on.  I agree with that, I agree that including a
 polished 5/5 would be a good thing, and then I further agree that
 1-4/5 could be delivered before 5/5 is ready.

Yes.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 3/3] git rebase -i: add static check for commands and SHA-1

2015-06-23 Thread Remi Galan Alfonso
Junio C Hamano gits...@pobox.com writes:
 Galan Rémi  remi.galan-alfo...@ensimag.grenoble-inp.fr writes:
 
   I used:
 read -r command sha1 rest EOF
 $line
 EOF
   because
 printf '%s' $line | read -r command sha1 rest
   doesn't work (the 3 variables have no value as a result).
   There might be a better way to do this, but I don't have it right now.
 
 while read line
 do
 (
 IFS=' '
 set x $line
 shift
 # now $1 is your command, $2 is sha1, $3 is remainder
 ...
 )
 done
 
 perhaps?

Will try, thanks!

 But more importantly, why do you even need to keep the bad ones in a
 separate .badcmd and .badsha files?  Isn't that bloating your changes
 unnecessarily, iow, if you issued your warning as you encounter them,
 wouldn't the change become cleaner and easier to understand (and as
 a side effect it may even become smaller)?  The _only_ thing that
 you would get by keeping them in temporary files is that you can do
 one header and bunch of errors, but is it so common to make a bad
 edit to the insn sheet that a sequence of errors, one per line
 becomes more burdensome to the end user?
 
 I would think
 
 stripspace |
 while read -r command sha1 rest
 do
 ...
 
 and showing the warning as you detect inside that loop would be
 sufficient.  Perhaps I am missing subtle details of what you are
 doing.

You're not missing subtle details, it is as you said, I tough it would
be clearer for the user to have one header and a bunch of errors.
Moreover while it would make the patch smaller and easier to
understand, I am not sure about making it cleaner; I guess I will have
to try and see how it ends up.

What I'm not completely happy with your proposition is the fact that 
if there are multiple errors of the same kind, the output would look 
something like:
 Warning: the command isn't recognized in the following line:
 badcmd1 some_sha some_commit_message
 Warning: the command isn't recognized in the following line:
 badcmd2 some_sha some_commit_message
(I don't think it would be good to squash some understandable warning
message and the faulty line in one line, it would probably end up
being too long)
However as you say, such mistakes are uncommon so I guess it's fine.

Thanks,
Rémi
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit

2015-06-23 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 Galan Rémi  remi.galan-alfo...@ensimag.grenoble-inp.fr writes:

 +test_rebase_end () {
 +test_when_finished git checkout master 
 +git branch -D $1 

 Is this one guaranteed to succeed?  Do we want to consider it a
 failure to remove $1 (e.g. dropTest)?

 $ git branch -D no-such-branch ; echo $?
 error: branch 'no-such-branch' not found.
 1

 If dropTest branch did not exist before the test that begins with
 a call to this function, what happens?

 Besides, a function that must be called at the beginning of a test
 piece has a name that ends with _end?  That sounds funny, no?

Ah, scratch this last paragraph.  I didn't see this is a
multi-command when_finished.

But other parts of what I said still stands.  For example, even in a
multi-command when_finished, git branch -D $1  if the main
body of the test failed to create the branch $1, that command
would fail and skip the remainder of the clean-up, so the first
point above is still suspect.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 10/10] send-email: suppress meaningless whitespaces in from field

2015-06-23 Thread Remi Lespinet
Remove leading and trailing whitespaces in from field before
interepreting it to improve consistency with other options.  The
split_addrs function already take care of trailing and leading
whitespaces for to, cc and bcc fields.
The from option now:

 - has the same behavior when passing arguments like
 j...@example.com , \t j...@example.com  or
   j...@example.com.

 - interprets aliases in string containing leading and trailing
   whitespaces such as  alias or alias\t like other options.

Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr
---
 git-send-email.perl   |  1 +
 t/t9001-send-email.sh | 24 
 2 files changed, 25 insertions(+)

diff --git a/git-send-email.perl b/git-send-email.perl
index 8bf6656..749d809 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -786,6 +786,7 @@ if (!$force) {
 }
 
 if (defined $sender) {
+   $sender =~ s/^\s+|\s+$//g;
($sender) = expand_aliases($sender);
 } else {
$sender = $repoauthor || $repocommitter || '';
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index 3c5b853..8e21fb0 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -1719,4 +1719,28 @@ test_expect_success $PREREQ 'aliases work with email 
list' '
test_cmp expected-list actual-list
 '
 
+test_expect_success $PREREQ 'leading and trailing whitespaces are removed' '
+   echo alias to2 t...@example.com .mutt 
+   echo alias cc1 Cc 1 c...@example.com .mutt 
+   test_config sendemail.aliasesfile .mutt 
+   test_config sendemail.aliasfiletype mutt 
+   TO1=$(echo QTo 1 t...@example.com | q_to_tab) 
+   TO2=$(echo QZto2 | qz_to_tab_space) 
+   CC1=$(echo cc1 | append_cr) 
+   BCC1=$(echo Q b...@example.com Q | q_to_nul) 
+   git send-email \
+   --dry-run \
+   --from=Example f...@example.com \
+   --to=$TO1 \
+   --to=$TO2 \
+   --to=  t...@example.com\
+   --cc=$CC1 \
+   --cc=Cc2 c...@example.com \
+   --bcc=$BCC1 \
+   --bcc=b...@example.com \
+   0001-add-master.patch | replace_variable_fields \
+   actual-list 
+   test_cmp expected-list actual-list
+'
+
 test_done
-- 
1.9.1

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v6 09/10] send-email: allow multiple emails using --cc, --to and --bcc

2015-06-23 Thread Remi Lespinet
Accept a list of emails separated by commas in flags --cc, --to and
--bcc.  Multiple addresses can already be given by using these options
multiple times, but it is more convenient to allow cutting-and-pasting
a list of addresses from the header of an existing e-mail message,
which already lists them as comma-separated list, as a value to a
single parameter.

The following format can now be used:

$ git send-email --to='Jane j...@example.com, m...@example.com'

Remove the limitation imposed by 79ee555b (Check and document the
options to prevent mistakes, 2006-06-21) which rejected every argument
with comma in --cc, --to and --bcc.

Helped-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr
Signed-off-by: Mathieu Lienard--Mayor mathieu.lienard--ma...@ensimag.imag.fr
Signed-off-by: Jorge Juan Garcia Garcia 
jorge-juan.garcia-gar...@ensimag.imag.fr
Signed-off-by: Matthieu Moy matthieu@grenoble-inp.fr
Signed-off-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr
---
 Documentation/git-send-email.txt | 12 +--
 git-send-email.perl  | 17 ++--
 t/t9001-send-email.sh| 44 
 3 files changed, 52 insertions(+), 21 deletions(-)

diff --git a/Documentation/git-send-email.txt b/Documentation/git-send-email.txt
index b48a764..afd9569 100644
--- a/Documentation/git-send-email.txt
+++ b/Documentation/git-send-email.txt
@@ -49,17 +49,17 @@ Composing
of 'sendemail.annotate'. See the CONFIGURATION section for
'sendemail.multiEdit'.
 
---bcc=address::
+--bcc=address,...::
Specify a Bcc: value for each email. Default is the value of
'sendemail.bcc'.
 +
-The --bcc option must be repeated for each user you want on the bcc list.
+This option may be specified multiple times.
 
---cc=address::
+--cc=address,...::
Specify a starting Cc: value for each email.
Default is the value of 'sendemail.cc'.
 +
-The --cc option must be repeated for each user you want on the cc list.
+This option may be specified multiple times.
 
 --compose::
Invoke a text editor (see GIT_EDITOR in linkgit:git-var[1])
@@ -110,13 +110,13 @@ is not set, this will be prompted for.
Only necessary if --compose is also set.  If --compose
is not set, this will be prompted for.
 
---to=address::
+--to=address,...::
Specify the primary recipient of the emails generated. Generally, this
will be the upstream maintainer of the project involved. Default is the
value of the 'sendemail.to' configuration value; if that is unspecified,
and --to-cmd is not specified, this will be prompted for.
 +
-The --to option must be repeated for each user you want on the to list.
+This option may be specified multiple times.
 
 --8bit-encoding=encoding::
When encountering a non-ASCII message or subject that does not
diff --git a/git-send-email.perl b/git-send-email.perl
index a03392c..8bf6656 100755
--- a/git-send-email.perl
+++ b/git-send-email.perl
@@ -460,20 +460,6 @@ my ($repoauthor, $repocommitter);
 ($repoauthor) = Git::ident_person(@repo, 'author');
 ($repocommitter) = Git::ident_person(@repo, 'committer');
 
-# Verify the user input
-
-foreach my $entry (@initial_to) {
-   die Comma in --to entry: $entry'\n unless $entry !~ m/,/;
-}
-
-foreach my $entry (@initial_cc) {
-   die Comma in --cc entry: $entry'\n unless $entry !~ m/,/;
-}
-
-foreach my $entry (@bcclist) {
-   die Comma in --bcclist entry: $entry'\n unless $entry !~ m/,/;
-}
-
 sub parse_address_line {
if ($have_mail_address) {
return map { $_-format } Mail::Address-parse($_[0]);
@@ -1051,7 +1037,8 @@ sub sanitize_address_list {
 }
 
 sub process_address_list {
-   my @addr_list = expand_aliases(@_);
+   my @addr_list = map { parse_address_line($_) } @_;
+   @addr_list = expand_aliases(@addr_list);
@addr_list = sanitize_address_list(@addr_list);
@addr_list = validate_address_list(@addr_list);
return @addr_list;
diff --git a/t/t9001-send-email.sh b/t/t9001-send-email.sh
index 714fcae..3c5b853 100755
--- a/t/t9001-send-email.sh
+++ b/t/t9001-send-email.sh
@@ -1675,4 +1675,48 @@ test_expect_success $PREREQ '--[no-]xmailer with 
sendemail.xmailer=false' '
do_xmailer_test 1 --xmailer
 '
 
+test_expect_success $PREREQ 'setup expected-list' '
+   git send-email \
+   --dry-run \
+   --from=Example f...@example.com \
+   --to=To 1 t...@example.com \
+   --to=t...@example.com \
+   --to=t...@example.com \
+   --cc=Cc 1 c...@example.com \
+   --cc=Cc2 c...@example.com \
+   --bcc=b...@example.com \
+   --bcc=b...@example.com \
+   0001-add-master.patch | replace_variable_fields \
+   expected-list
+'
+
+test_expect_success $PREREQ 'use email list in --cc --to and --bcc' '
+   git send-email \
+   --dry-run \
+   --from=Example f...@example.com \
+   --to=To 1 t...@example.com, 

[PATCH v6 07/10] send-email: reduce dependencies impact on parse_address_line

2015-06-23 Thread Remi LESPINET
Matthieu Moy matthieu@grenoble-inp.fr writes:

 Your git send-email does not seem to like PATCHes 08-10/10 ;-).

 Up to PATCH 07, the series looks good.

Yes, I get Too many recipients error... If I specify
--no-signoff-by-cc then this is also aborted but I get no error (at
least I've not seen it last time...). If I rerun git send-email with
only one patch, it works (even if there is no difference with the
number of recipient a priori). I'll investigate asap, not
sure it's a bug, maybe It's just me !
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread Michael Haggerty
On 06/23/2015 09:53 PM, David Turner wrote:
 On Tue, 2015-06-23 at 17:51 +0200, Michael Haggerty wrote:
 [...]
 * I don't like the fact that you have replaced `struct ref_transaction
 *` with `void *` in the public interface. On a practical level, I like
 the bit of type-safety that comes with the more specific declaration.
 But on a more abstract level, I think that the concept of a transaction
 could be useful across backends, for example in utility functions that
 verify that a proposed set of updates are internally consistent. I would
 rather see either

   * backends extend a basic `struct ref_transaction` to suit their
 needs, and upcast/downcast pointers at the module boundary, or

   * `struct ref_transaction` itself gets a `void *` member that backends
 can use for whatever purposes they want.
 
 There are no common fields between refs-be-file transactions and
 refs-be-lmdb transactions.  I don't see much gain from adding an empty
 ref_transaction that backends could extend, since we would have to
 explicitly upcast/downcast all over the place.

If you ask me, it would be better to do a bunch of up/downcasts within
the single module (via two helper functions that could even do
consistency checks) than have no help from the compiler in preventing
people from passing unrelated pointer types into the `void *transaction`
argument. Plus the `struct ref_transaction *` variables scattered
throughout the code are a lot more self-explanatory than `void *`.

 * Regarding MERGE_HEAD: you take the point of view that it must continue
 to be stored as a file. And yet it must also behave somewhat like a
 reference; for example, `git rev-parse MERGE_HEAD` works today.
 MERGE_HEAD is also used for reachability, right?

 Another point of view is that MERGE_HEAD is a plain old boring
 reference, but there is some other metadata related to it that the refs
 backend has to store. The file-based backend would have special-case
 code to read the additional data from the tail of the loose refs file
 (and be sure to write the metadata when writing the reference), but
 other backends could store the reference with the rest but do their own
 thing with the metadata. So I guess I'm wondering whether the refs API
 needs a MERGE_HEAD-specific way to read and write MERGE_HEAD along with
 its metadata.
 
 You are probably right that this is a good idea.
 
 * Don't the same considerations that apply to MERGE_HEAD also apply to
 FETCH_HEAD?
 
 All of the tests pass without any special handling of FETCH_HEAD.

That's odd. From git-fetch.txt:

The names of refs that are fetched, together with the object names
they point at, are written to `.git/FETCH_HEAD`.  This information
may be used by scripts or other git commands, such as
linkgit:git-pull[1].

It seems like the test suite is reading FETCH_HEAD via the refs API in a
couple of places. I don't understand why these don't fail when LMDB is
being used...

 * I'm showing my ignorance of LMDB, but I don't see where the LMDB
 backend initializes its database during `git init-db`. Is that what
 `init_env()` does? But it looks like `init_env()` is called on every git
 invocation (via `git_config_early()`). Puzzled.
 
 There is no need to explicitly create the database (other than with
 mkdir); init_env does everything for you.

OK.

 * Rehash of the last two points: I expected one backend function that is
 used to initialize the refs backend when a new repository is created
 (e.g., in `git init`). The file-based backend would use this function to
 create the `refs`, `refs/heads`, and `refs/tags` directories. I expected
 a second function that is called once every time git runs in an existing
 repository (this one might, for example, open a database connection).
 And maybe even a third one that closes down the database connection
 before git exits. Would you please explain how this actually works?
 
 LMDB doesn't really have the concept of a connection.  It's basically
 just a couple of files that communicate using shared memory (and maybe
 some other locking that I haven't paid attention to).  There is the
 concept of a transaction, which is the unit of concurrency (each
 thread may only have one open transaction).  Transactions are either
 read-only or read-write, and there can only be one read-write
 transaction open at a time (across the entire system).  Read-only
 transactions take a snapshot of the DB state at transaction start time.
 This combination of features means that we need to be a bit clever about
 read-only transactions; if a read-write transaction occurs in a separate
 process, we need to restart any read-only transactions to pick up its
 changes.

If you are thinking about an *unrelated* separate process, then Git's
philosophy is that if our process is reading *some* valid state of the
references, it's all good even if that state is not quite the newest.
After all, who's to say whether our process ran before or after the
other process? As long as each 

Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-23 Thread David Turner
On Tue, 2015-06-23 at 06:54 -0400, Jeff King wrote:
 + mkconfig 1 preciousObjects .git/config 

nit: I think it's better to use git config rather than manually writing
config files.  git config is more future-proof if we up switching config
backends; it's also more composable with other configs (making this test
easier to copy and manipulate), and more explicit.

Other than that it looks good to me.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] Enable core.fsyncObjectFiles by default

2015-06-23 Thread Stefan Beller
Linus Torvalds started a discussion[1] if we want to play rather safe
than use defaults which make sense only for the most power users of Git:

 So git is safe in the sense that you won't really lose any data,
 but you may well be inconvenienced.  The fsync each object config
 option is there in case you don't want that inconvenience, but it
 should be noted that it can make for a hell of a performance impact.

 Of course, it might well be the case that the actual default
 might be worth turning around. Most git users probably don't
 care about that kind of apply two hundred patches from Andrew
 Morton kind of workload, although rebase a big patch-series
 does end up doing basically the same thing, and might be more
 common.

This patch enables fsync_object_files by default.

[1] https://plus.google.com/u/1/+JonathanCorbet/posts/JBxiKPe3VXa

Signed-off-by: Stefan Beller sbel...@google.com
---
 Documentation/config.txt | 8 
 environment.c| 2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 43bb53c..dce2640 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -693,10 +693,10 @@ core.whitespace::
 core.fsyncObjectFiles::
This boolean will enable 'fsync()' when writing object files.
 +
-This is a total waste of time and effort on a filesystem that orders
-data writes properly, but can be useful for filesystems that do not use
-journalling (traditional UNIX filesystems) or that only journal metadata
-and not file contents (OS X's HFS+, or Linux ext3 with data=writeback).
+This ensures objects are written to disk instead of relying on the
+operating systems cache and eventual write. Disabling this option will
+yield performance with a trade off in safety for repository corruption
+during power loss.
 
 core.preloadIndex::
Enable parallel index preload for operations like 'git diff'
diff --git a/environment.c b/environment.c
index 61c685b..b406f5e 100644
--- a/environment.c
+++ b/environment.c
@@ -35,7 +35,7 @@ const char *git_attributes_file;
 int zlib_compression_level = Z_BEST_SPEED;
 int core_compression_level;
 int core_compression_seen;
-int fsync_object_files;
+int fsync_object_files = 1;
 size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE;
 size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT;
 size_t delta_base_cache_limit = 96 * 1024 * 1024;
-- 
2.4.1.345.gab207b6.dirty

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit

2015-06-23 Thread Remi Galan Alfonso
Junio C Hamano gits...@pobox.com writes:
  Galan Rémi  remi.galan-alfo...@ensimag.grenoble-inp.fr writes:
  
   +test_rebase_end () {
   +test_when_finished git checkout master 
   +git branch -D $1 
  
  Is this one guaranteed to succeed?  Do we want to consider it a
  failure to remove $1 (e.g. dropTest)?
  
  $ git branch -D no-such-branch ; echo $?
  error: branch 'no-such-branch' not found.
  1
  
  If dropTest branch did not exist before the test that begins with
  a call to this function, what happens?
 

Since the function is
   test_when_finished git checkout master 
   git branch -D $1 
   test_might_fail git rebase --abort 
   git checkout -b $1 master
If $1 doesn't exist, then it means that 'git checkout -b $1 master'
failed, thus the test would fail before reaching the part in
'test_when_finished'.
However I guess there are more reasons that could cause 'git branch -D
$1' to fail so I'll add a 'test_might_fail' in front of it.

 Besides, a function that must be called at the beginning of a test
 piece has a name that ends with _end?  That sounds funny, no?

I see your point but I'm not really sure how to call it, considering
that what it does is creating a branch to test on it, and taking care
of the cleaning at the end of the test.
Maybe something more generic like setup_and_clean ?
 
  +test_might_fail git rebase --abort 
  +git checkout -b $1 master
  +}
 
 I'm wondering if this is not sufficient.
 
 test_rebase_i_drop_prepare () {
 git reset --hard 
 git checkout -B $1 master
 }
 
 I am guessing that you named _end because it has when_finished, but
 as far as I can tell, even after these three patches, the tests do
 not really rely on the fact that it is on 'master' branch.  More
 importantly, just being on 'master' branch is not a sufficient
 cleanliness for the next test (and that is why you added these
 branch -D and might-fail rebase --abort to this function in the
 first place), it seems.  So...

I removed the branch in case someone used the same name when creating
a branch in a future test. However he would notice it when writing the
test and it's not something that would suddenly break when modifying
the code, so it might not be necessary.
The might-fail rebase --abort is there in case this test fails in
the middle (because of a future modification of rebase for example) to
avoid failling all the following tests that use rebase.

The name test_rebase_i_drop_prepare wouldn't be accurate since 2/3
and 3/3 uses the function as well and don't really have much to do
with 'drop'.

Thanks,
Rémi
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v6 09/10] send-email: allow multiple emails using --cc, --to and --bcc

2015-06-23 Thread Matthieu Moy
Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr writes:

 Helped-by: Remi Lespinet remi.lespi...@ensimag.grenoble-inp.fr

This is funny in a patch sent by the same Remi Lespinet ;-).

Anyway, the whole series looks good to me now (I finally got all up to
10/10).

Cheers,

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-23 Thread Junio C Hamano
Jeff King p...@peff.net writes:

  This extension does not change git's behavior at all. It is useful only
  for testing format-1 compatibility.
 +
 +`preciousObjects`
 +~
 +
 +When the config key `extensions.preciousObjects` is set to `true`,
 +objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
 +`git repack -d`).

OK.  In essense, the 'extension' on the disk is like 'capability' on
the wire, in that you are not supposed to ask for capability they do
not understand, and you are not supposed to touch a repository you
do not understand.

And the above looks like a reasonable sample feature to protect by
the 'extension' system.

 + if (delete_redundant  repository_format_precious_objects)
 + die(cannot repack in a precious-objects repo);

This message initially threw me off during my cursory reading, but
the code tells me that this is only about repack -d.

Unfortunately the users do not get the chance to read the code;
perhaps s/cannot repack/ -d/; or something?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH v5 07/10] send-email: reduce dependancies impact on parse_address_line

2015-06-23 Thread Remi Lespinet
Matthieu Moy matthieu@grenoble-inp.fr writes:
  We can redirect todo_output to a variable but I've not found better...
  (Maybe someone has the solution here ?). Also there's no summary at
  the end of the test (as with other perl tests).
 
 You can get the 1..44 at the end with
 
 printf 1..%d\n, Test::More-builder-current_test;
 
 This is what t9700/test.pl does.

I can also get it by removing the line 

 Test::More-builder-no_ending(1);

and replacing

 use Test::More;

by

 use Test::More no_plan;

I think I'm going to do that, because the no_ending thing makes the
test suite success even if every test fails: at the end we have

# test_external test Perl address parsing function was ok
# test_external_without_stderr test no stderr: Perl address parsing function 
was ok

in case everything is ok. With the no_ending line, only the second
line reports failures, the first is always the same.
I think both must be marked red.


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCHv6 1/3] git-rebase -i: add command drop to remove a commit

2015-06-23 Thread Matthieu Moy
Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr writes:

 I think that the indentation on its own is enough to avoid confusion
 test_rebase_end () {
  test_when_finished git checkout master 
  git branch -D $1 
  test_might_fail git rebase --abort 
  git checkout -b $1 master
 }
 but your idea is fine as well, so I'm ok with either way.

Read too quickly, it looks like a mis-indentation (I could laugh at Eric
here, but I made the same confusion when reading the code at first). By
avoid the confusion I mean make it clear it's not a mis-indentation.

Anyway, we're just bikeshedding here. Any version is fine with me.

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread David Turner
 Just to beg a request: LMDB is not available on some MPP architectures to 
 which git has been ported. If it comes up, I beg you not to add this as a 
 dependency to base git components.

My changes make `configure` check for the presence of liblmdb. The LMDB
code is only built if liblmdb is present.  So, I think we're good.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 4/5] bisect: add the terms old/new

2015-06-23 Thread Matthieu Moy
Remi Galan Alfonso remi.galan-alfo...@ensimag.grenoble-inp.fr writes:

 Matthieu Moy matthieu@imag.fr writes:
 Signed-off-by: Matthieu Moy matthieu@grenoble-inp.fr
 Signed-off-by: Matthieu Moy matthieu@imag.fr

 Sounds like you went all out with this patch.

Ah, the first line was in the original patch, and the second is added by
send-email -s ... Both me and myself agree that one of them can be
removed indeed ;-).

-- 
Matthieu Moy
http://www-verimag.imag.fr/~moy/
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v7 0/5] git bisect old/new

2015-06-23 Thread Junio C Hamano
Matthieu Moy matthieu@grenoble-inp.fr writes:

 It is somewhat confusing to see v3 yesterday and then this v7 next
 day.  How did I miss v4 thru v6?

 Oops, I pattern-matched the wrong part when reading [PATCH v3 6/6].
 Indeed, this should have been numberred v4, I should have s/v6/v3/ in my
 sentence above.

OK.

 So, the first commit having property name_good is between base and
 branch2. So, the message seems wrong in two ways:

Yeah, I agree with you on both points.

 Actually, I think it would make sense to include my drawing above
 in a comment in the code.

Sounds like a good idea.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread David Turner
On Tue, 2015-06-23 at 15:53 -0400, David Turner wrote:
  * Regarding MERGE_HEAD: you take the point of view that it must continue
  to be stored as a file. And yet it must also behave somewhat like a
  reference; for example, `git rev-parse MERGE_HEAD` works today.
  MERGE_HEAD is also used for reachability, right?
  
  Another point of view is that MERGE_HEAD is a plain old boring
  reference, but there is some other metadata related to it that the refs
  backend has to store. The file-based backend would have special-case
  code to read the additional data from the tail of the loose refs file
  (and be sure to write the metadata when writing the reference), but
  other backends could store the reference with the rest but do their own
  thing with the metadata. So I guess I'm wondering whether the refs API
  needs a MERGE_HEAD-specific way to read and write MERGE_HEAD along with
  its metadata.
 
 You are probably right that this is a good idea.

On reflection, I think it might make sense to keep MERGE_HEAD as a file.
The problem is that not only would refs backends have to add new
MERGE_HEAD-handling functions, but we would also need new plumbing
commands to allow scripts to access the complete contents of MERGE_HEAD.
That seems more complicated to me.  

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread Jeff King
On Mon, Jun 22, 2015 at 08:50:56PM -0400, David Turner wrote:

 The db backend runs git for-each-ref about 30% faster than the files
 backend with fully-packed refs on a repo with ~120k refs.  It's also
 about 4x faster than using fully-unpacked refs.  In addition, and
 perhaps more importantly, it avoids case-conflict issues on OS X.

Neat.

Can you describe a bit more about the reflog handling?

One of the problems we've had with large-ref repos is that the reflog
storage is quite inefficient. You can pack all the refs, but you may
still be stuck with a bunch of reflog files with one entry, wasting a
whole inode. Doing a git repack when you have a million of those has
horrible cold-cache performance. Basically anything that isn't
one-file-per-reflog would be a welcome change. :)

It has also been a dream of mine to stop tying the reflogs specifically
to the refs. I.e., have a spot for reflogs of branches that no longer
exist, which allows us to retain them for deleted branches. Then you can
possibly recover from a branch deletion, whereas now you have to dig
through git fsck's dangling output. And the reflog, if you don't
expire it, becomes a suitable audit log to find out what happened to
each branch when (whereas now it is full of holes when things get
deleted).

Does your solution handle something like that?

I was thinking of actually moving to a log-structured ref storage.
Something like:

  - any ref write puts a line at the end of a single logfile that
contains the ref name, along with the normal reflog data

  - the logfile is the source of truth for the ref state. If you want to
know the value of any ref, you can read it backwards to find the
last entry for the ref. Everything else is an optimization.

Let's call the number of refs N, and the number of ref updates in
the log U.

  - we keep a key/value index mapping the name of any branch that exists
to the byte offset of its entry in the logfile. This would probably
be in some binary key/value store (like LMDB). Without this,
resolving a ref is O(U), which is horrible. With it, it should be
O(1) or O(lg N), depending on the index data structure.

  - the index can also contain other optimizations. E.g., rather than
point to the entry in the logfile, it can include the sha1 directly
(to avoid an extra level of indirection). It may want to include the
peeled value, as the current packed-refs file does.

  - Reading all of the reflogs (e.g., for repacking) is O(U), just like
it is today. Except the storage for the logfile is a lot more
compact than what we store today, with one reflog per file.

  - Reading a single reflog is _also_ O(U), which is not as good as
today. But if each log entry contains a byte offset of the previous
entry, you can follow the chain (it is still slightly worse, because
you are jumping all over the file, rather than reading a compact set
of lines).

  - Pruning the reflog entries from the logfile requires rewriting the
whole thing. That's similar to today, where we rewrite each of the
reflog files.

One of the nice properties of this system is that it should be very
resilient to corruption and races. Most of the operations are either
appending to a file, or writing to a tempfile and renaming in place.
The exception is the key/value index, but if we run into any problems
there, it can be rebuilt by walking over the logfile (for a cost of
O(U)).

I dunno. Maybe I am overthinking it. But it really feels like the _refs_
are a key/value thing, but the _reflogs_ are not. You can cram them into
a key/value store, but you're probably operating on them as a big blob,
then.

 I chose to use LMDB for the database.  LMDB has a few features that make
 it suitable for usage in git:

One of the complaints that Shawn had about sqlite is that there is no
native Java implementation, which makes it hard for JGit to ship a
compatible backend. I suspect the same is true for LMDB, but it is
probably a lot simpler than sqlite (so reimplementation might be
possible).

But it may also be worth going with a slightly slower database if we can
get wider compatibility for free.

 To test this backend's correctness, I hacked test-lib.sh and
 test-lib-functions.sh to run all tests under the refs backend. Dozens
 of tests use manual ref/reflog reading/writing, or create submodules
 without passing --refs-backend-type to git init.  If those tests are
 changed to use the update-ref machinery or test-refs-be-db (or, in the
 case of packed-refs, corrupt refs, and dumb fetch tests, are skipped),
 the only remaining failing tests are the git-new-workdir tests and the
 gitweb tests.

I think we'll need to bump core.repositoryformatversion, too. See the
patches I just posted here:

  http://thread.gmane.org/gmane.comp.version-control.git/272447

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More 

Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-23 Thread Jeff King
On Tue, Jun 23, 2015 at 06:14:22PM +0700, Duy Nguyen wrote:

  +   if (delete_redundant  repository_format_precious_objects)
  +   die(cannot repack in a precious-objects repo);
  +
  if (pack_kept_objects  0)
  pack_kept_objects = write_bitmaps;
 
 
 I know both commands have translatable strings that are not marked
 with _(). But if you reroll, maybe you could add _() for these new
 strings. It's even better if you mark all others in these commands
 too, if you have time.

Yeah, I agree these should be marked for translation. Thanks.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git name-rev not accepting abbreviated SHA with --stdin

2015-06-23 Thread Sitaram Chamarty
Hi all,

git name-rev does not accept abbreviated SHAs if --stdin is used,
though it works when the SHA is given directly on the command line:

$ git version
git version 2.4.3
$ git name-rev --tags d73f544
d73f544 tags/v3.6.3~29
$ git name-rev --tags --stdin  d73f544
d73f544

This *is* documented, but I'm curious why this distinction is made.  Is
it merely a matter of parsing or were there some other complications I
am unaware of, which forced this distinction to be made?

thanks
sitaram
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Enable core.fsyncObjectFiles by default

2015-06-23 Thread Jeff King
On Tue, Jun 23, 2015 at 02:57:23PM -0700, Stefan Beller wrote:

 Linus Torvalds started a discussion[1] if we want to play rather safe
 than use defaults which make sense only for the most power users of Git:
 
  So git is safe in the sense that you won't really lose any data,
  but you may well be inconvenienced.  The fsync each object config
  option is there in case you don't want that inconvenience, but it
  should be noted that it can make for a hell of a performance impact.
 
  Of course, it might well be the case that the actual default
  might be worth turning around. Most git users probably don't
  care about that kind of apply two hundred patches from Andrew
  Morton kind of workload, although rebase a big patch-series
  does end up doing basically the same thing, and might be more
  common.
 
 This patch enables fsync_object_files by default.

If you are looking for safety out of the box, I think this falls far
short, as we do not fsync all of the other files. For instance, we do
not fsync refs before they are written (nor anything else that uses the
commit_lock_file() interface to rename, such as the index).  We do
always fsync packfiles and their indices.

I had always assumed this was fine on ext4 with data=ordered (i.e.,
either the rename and its pointed-to content will go through, or not; so
you either get your update or the old state, but not a garbage or empty
file). But it sounds from what Ted wrote in:

  http://article.gmane.org/gmane.linux.file-systems/97255

that this may not be the case. If it's not, I think we should consider
fsyncing ref writes.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Enable core.fsyncObjectFiles by default

2015-06-23 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 I had always assumed this was fine on ext4 with data=ordered (i.e.,
 either the rename and its pointed-to content will go through, or not; so
 you either get your update or the old state, but not a garbage or empty
 file). But it sounds from what Ted wrote in:

   http://article.gmane.org/gmane.linux.file-systems/97255

 that this may not be the case. If it's not, I think we should consider
 fsyncing ref writes.

That matches my understanding.  IIRC, we do fsync (without a knob to
turn it off) packs, we do not fsync refs (as you described above),
the index, or working tree files (via write_entry()).  Among these:

 * If we are so paranoid that loss of new loose objects matter (as
   opposed to Heh, it is just the matter of 'git add' again) to
   cause us to set core.fsyncObjectFiles, we should definitely fsync
   ref writes.  They are even more precious.

 * The index is much less precious.  In the worst case, you can 'git
   reset' (no flags) and re-add from the working tree and nothing
   unrecoverable is lost.  I do not mind a knob to force us fsync,
   but we may want it to be separate from core.fsyncObjectFiles.

 * The working tree files?  I am not sure.  Again, recovery is just
   to run git diff to notice what was eaten by the power failure
   and then git checkout $path to restore from a known state,
   so...

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Enable core.fsyncObjectFiles by default

2015-06-23 Thread Junio C Hamano
Theodore Ts'o ty...@mit.edu writes:

 The main issue is that non-expert users might not realize that they
 really need to run git fsck after a crash; otherwise, what might
 happen is that although git is only appending, that you might have
 some zero-length (or partially written) git object or pack files, and
 while you might not notice at that moment, it might come and bite you
 later.

Regarding loose object files, given that we write to a temporary,
optionally fsync, close and then move to the final name, would we
still see partially written file if we omit the fsync, or would the
corruption be limited to either empty or missing?

The reason I am wondering is because the codepath to create an
object (i.e. update-index --add, hash-object -w, or add) first
checks if a packed or a loose object file _exists_ and if so
bypasses writing the same thing anew, but the existence check for a
loose object is to merely making sure that access(F_OK) (and
optionally utime()) succeeds.  If the potential breakage is limited
to truncation to empty, then we could replace it with stat(2) and
st.st_size check, as no loose object file can be empty.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


incomplete footers added by list server?

2015-06-23 Thread Dennis Kaarsemaker
Since last friday between 10:39 and 10:50 UTC, mails to git@vger
suddenly get an incomplete footer added.

Instead of the normal footer:

To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Only the first line is now added, actually making it fairly useless :)

-- 
Dennis Kaarsemaker
http://www.kaarsemaker.net

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: incomplete footers added by list server?

2015-06-23 Thread Johannes Schindelin
Hi Dennis,

On 2015-06-23 11:43, Dennis Kaarsemaker wrote:
 Since last friday between 10:39 and 10:50 UTC, mails to git@vger
 suddenly get an incomplete footer added.
 
 Instead of the normal footer:
 
 To unsubscribe from this list: send the line unsubscribe git in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 
 
 Only the first line is now added, actually making it fairly useless :)

Actually, I received this mail with the entire footer intact.

Ciao,
Dscho
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v3 12/19] initial_ref_transaction_commit(): check for duplicate refs

2015-06-23 Thread Michael Haggerty
On 06/22/2015 11:06 PM, Junio C Hamano wrote:
 Michael Haggerty mhag...@alum.mit.edu writes:
 
 Error out if the ref_transaction includes more than one update for any
 refname.

 Signed-off-by: Michael Haggerty mhag...@alum.mit.edu
 ---
  refs.c | 11 +++
  1 file changed, 11 insertions(+)
 
 This somehow feels like ehh, I now know better and this function
 should have been like this from the beginning to me.
 
 But that is OK.
 
 Is the initial creation logic too fragile to deserve its own
 function to force callers to think about it, by the way?
 
 What I am wondering is if we could turn the safety logic that appear
 here (i.e. no existing refs must be assumed by the set of updates,
 etc.)  into an optimization cue and implement this as a special case
 helper to ref_transaction_commit(), i.e.
 
   ref_transaction_commit(...)
 {
   if (updates are all initial creation 
 no existing refs in repository)
   return initial_ref_transaction_commit(...);
   /* otherwise we do the usual thing */
   ...
   }
 
 and have clone call ref_transaction_commit() as usual.

The safety logic in this function is (approximately) necessary, but not
sufficient, to guarantee safety. One of the shortcuts that it takes is
not locking the references while they are being created. Therefore, it
would be unsafe for one process to call ref_transaction_commit() while
another is calling initial_ref_transaction_commit(). So the caller has
to know somehow that no other processes are working in the repository
for this optimization to be safe. It conveys that knowledge by calling
initial_ref_transaction_commit() rather than ref_transaction_commit().

Of course the next question is, How does `git clone` know that no other
process is working in the new repository? Actually, it doesn't. For
example, I just verified that I can run

git clone $URL mygit 
sleep 0.1
cd mygit
git commit --allow-empty -m New root commit

and thereby overwrite the upstream `master` without the usual
non-fast-forward protection. I guess we are just relying on the user's
common sense not to run Git commands in a new repository before its
creation is complete.

I suppose we *could* special-case `git clone` to not finish the
initialization of the repository (for example, not write the `config`
file) until *after* the packed-refs file is written. This would prevent
other git processes from recognizing the directory as a Git repository
and so prevent them from running before the clone is finished.

But I think if anything it would make more sense to go the other direction:

* Teach ref_transaction_commit() an option that asks it to write
  references updates to packed-refs instead of loose refs (but
  locking the references as usual).

* Change clone to use ref_transaction_commit() like everybody
  else, passing it the new REFS_WRITE_TO_PACKED_REFS option.

Then clone would participate in the normal locking protocol, and it
wouldn't *matter* if another process runs before the clone is finished.
There would also be some consistency benefits. For example, if
core.logallrefupdates is set globally or on the command line, the
initial reference creations would be reflogged. And other operations
that write references in bulk could use the new
REFS_WRITE_TO_PACKED_REFS option to prevent loose reference proliferation.

But I don't think any of this is a problem in practice, and I think we
can live with using the optimized-but-not-100%-safe
initial_ref_transaction_commit() for cloning.

Michael

-- 
Michael Haggerty
mhag...@alum.mit.edu

--
To unsubscribe from this list: send the line unsubscribe git in


Dependency Management

2015-06-23 Thread Jean Audibert
Hi,

Sorry to bother you with this question but I can't find any official answer 
or strong opinion from Git community.

In my company we recently started to use Git and we wonder how to share code 
and manage dependencies with Git?
Use case: in project P we need to include lib-a and lib-b (libraries shared by 
several projects)

In your opinion, what is the future proof solution?
* Use submodule
* Use subtree

We know there is lot of PRO/CONS but I feel that subtree is behind in the 
race and the latest version of submodule work fine

Suggestions are very welcome.
Thanks in advance,

Jean Audibert


_

This message may contain confidential information and is intended for specific 
recipients unless explicitly noted otherwise. If you have reason to believe you 
are not an intended recipient of this message, please delete it and notify the 
sender. This message may not represent the opinion of Euronext N.V. or any of 
its subsidiaries or affiliates, and does not constitute a contract or 
guarantee. Unencrypted electronic mail is not secure and the recipient of this 
message is expected to provide safeguards from viruses and pursue alternate 
means of communication where privacy or a binding message is desired.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 2/2] introduce preciousObjects repository extension

2015-06-23 Thread Duy Nguyen
On Tue, Jun 23, 2015 at 5:54 PM, Jeff King p...@peff.net wrote:
 diff --git a/builtin/prune.c b/builtin/prune.c
 index 0c73246..fc0c8e8 100644
 --- a/builtin/prune.c
 +++ b/builtin/prune.c
 @@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char 
 *prefix)
 return 0;
 }

 +   if (repository_format_precious_objects)
 +   die(cannot prune in a precious-objects repo);
 +
 while (argc--) {
 unsigned char sha1[20];
 const char *name = *argv++;
 diff --git a/builtin/repack.c b/builtin/repack.c
 index af7340c..8ae7fe5 100644
 --- a/builtin/repack.c
 +++ b/builtin/repack.c
 @@ -193,6 +193,9 @@ int cmd_repack(int argc, const char **argv, const char 
 *prefix)
 argc = parse_options(argc, argv, prefix, builtin_repack_options,
 git_repack_usage, 0);

 +   if (delete_redundant  repository_format_precious_objects)
 +   die(cannot repack in a precious-objects repo);
 +
 if (pack_kept_objects  0)
 pack_kept_objects = write_bitmaps;


I know both commands have translatable strings that are not marked
with _(). But if you reroll, maybe you could add _() for these new
strings. It's even better if you mark all others in these commands
too, if you have time.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: RFC/Pull Request: Refs db backend

2015-06-23 Thread Duy Nguyen
On Tue, Jun 23, 2015 at 12:36 PM, Junio C Hamano gits...@pobox.com wrote:
 If there is interest?  Shut up and take my money ;-)

Yeah. This may be the next big thing since pack bitmap. It's even
better if it enters 'master' hand in hand with pack protocol v2, but I
think v2 needs more time.

On Tue, Jun 23, 2015 at 7:50 AM, David Turner dtur...@twopensource.com wrote:
 To test this backend's correctness, I hacked test-lib.sh and
 test-lib-functions.sh to run all tests under the refs backend.

Now we have two. split-index also benefits from running through full
test suite like this. I propose we make make test run the test suite
twice. The first run is with default configuration, no split index, no
fancy ref backend. The second run enables split-index and switches to
new backend, running through all test cases. In future we can also
enable packv4 in this second run. There won't be a third run.

When the second ref backend comes, we can switch between the two
backends using a random number generator where we control both
algorithm and seed, so that when a test fails, the user can give us
their seed and we can re-run with the same configuration.

 Dozens of tests use manual ref/reflog reading/writing, or create submodules
 without passing --refs-backend-type to git init.  If those tests are
 changed to use the update-ref machinery or test-refs-be-db (or, in the
 case of packed-refs, corrupt refs, and dumb fetch tests, are skipped),
 the only remaining failing tests are the git-new-workdir tests and the
 gitweb tests.

I haven't read the series, but I guess you should also add a few tests
to run on the first run, so new code is exercised a bit even if people
skip the second run.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[RFC/PATCH 0/2] bumping repository format version

2015-06-23 Thread Jeff King
We've managed to avoid bumping core.repositoryformatversion for the past
10 years, which is great. But I think there are some looming features
that are going to need it. The most obvious one is changing the ref
storage, where we need some way to tell older gits no, please don't
think that taking 'refs/heads/foo.lock' is sufficient to actually lock.

The first patch in this series is an attempt to pave the way for version
bumps like this in as painless a way as possible, by letting us mark
incompatible extensions by name. That way we can version things like
how do you lock a ref independent of the main repositoryformatversion
setting (just like we do for index version, for example).

See the explanation in the first patch for more details. The second
patch shows another use of the extension feature to provide safety
in shared-object repos against older versions of git.

  [1/2]: introduce extensions form of core.repositoryformatversion
  [2/2]: introduce preciousObjects repository extension

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 04/19] for-each-ref: add '--points-at' option

2015-06-23 Thread Karthik Nayak
On Tue, Jun 23, 2015 at 4:08 AM, Eric Sunshine sunsh...@sunshineco.com wrote:
 On Sun, Jun 21, 2015 at 4:48 PM, Karthik Nayak karthik@gmail.com wrote:
 Add the '--points-at' option provided by 'ref-filter'. The
 option lets the user to pick only refs which point to a particular
 commit.

 Add documentation and tests for the same.

 Based-on-patch-by: Jeff King p...@peff.net
 Mentored-by: Christian Couder christian.cou...@gmail.com
 Mentored-by: Matthieu Moy matthieu@grenoble-inp.fr
 Signed-off-by: Karthik Nayak karthik@gmail.com
 ---
 diff --git a/t/t6301-for-each-ref-filter.sh b/t/t6301-for-each-ref-filter.sh
 index b1fa8d4..67de3a7 100755
 --- a/t/t6301-for-each-ref-filter.sh
 +++ b/t/t6301-for-each-ref-filter.sh
 @@ -16,4 +16,24 @@ test_expect_success 'setup some history and refs' '
 git update-ref refs/odd/spot master
  '

 +test_expect_success 'filtering with --points-at' '
 +   cat expect -\EOF 
 +   refs/heads/master
 +   refs/odd/spot
 +   refs/tags/three
 +   EOF
 +   git for-each-ref --format=%(refname) --points-at=master actual 
 +   test_cmp expect actual
 +'
 +
 +test_expect_success 'check signed tags with --points-at' '
 +   cat expect -\EOF 
 +   refs/heads/side
 +   refs/tags/four
 +   refs/tags/signed-tag four
 +   EOF
 +   git for-each-ref  --format=%(refname) %(*subject) --points-at=side 
 actual 

 s/for-each-ref\s+/for-each-ref /


Will change! thanks :)


-- 
Regards,
Karthik Nayak
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH 2/2] introduce preciousObjects repository extension

2015-06-23 Thread Jeff King
If this extension is used in a repository, then no
operations should run which may drop objects from the object
storage. This can be useful if you are sharing that storage
with other repositories whose refs you cannot see.

For instance, if you do:

  $ git clone -s parent child
  $ git -C parent config extensions.preciousObjects true
  $ git -C parent config core.repositoryformatversion 1

you now have additional safety when running git in the
parent repository. Prunes and repacks will bail with an
error, and `git gc` will skip those operations (it will
continue to pack refs and do other non-object operations).
Older versions of git, when run in the repository, will
fail on every operation.

Note that we do not set the preciousObjects extension by
default when doing a clone -s, as doing so breaks
backwards compatibility. It is a decision the user should
make explicitly.

Signed-off-by: Jeff King p...@peff.net
---
 Documentation/technical/repository-version.txt |  7 +++
 builtin/gc.c   | 20 +++-
 builtin/prune.c|  3 +++
 builtin/repack.c   |  3 +++
 cache.h|  1 +
 environment.c  |  1 +
 setup.c|  2 ++
 t/t1302-repo-version.sh| 22 ++
 8 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/Documentation/technical/repository-version.txt 
b/Documentation/technical/repository-version.txt
index 3d7106d..00ad379 100644
--- a/Documentation/technical/repository-version.txt
+++ b/Documentation/technical/repository-version.txt
@@ -79,3 +79,10 @@ The defined extensions are:
 
 This extension does not change git's behavior at all. It is useful only
 for testing format-1 compatibility.
+
+`preciousObjects`
+~
+
+When the config key `extensions.preciousObjects` is set to `true`,
+objects in the repository MUST NOT be deleted (e.g., by `git-prune` or
+`git repack -d`).
diff --git a/builtin/gc.c b/builtin/gc.c
index 36fe333..8b8dc6b 100644
--- a/builtin/gc.c
+++ b/builtin/gc.c
@@ -352,15 +352,17 @@ int cmd_gc(int argc, const char **argv, const char 
*prefix)
if (gc_before_repack())
return -1;
 
-   if (run_command_v_opt(repack.argv, RUN_GIT_CMD))
-   return error(FAILED_RUN, repack.argv[0]);
-
-   if (prune_expire) {
-   argv_array_push(prune, prune_expire);
-   if (quiet)
-   argv_array_push(prune, --no-progress);
-   if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
-   return error(FAILED_RUN, prune.argv[0]);
+   if (!repository_format_precious_objects) {
+   if (run_command_v_opt(repack.argv, RUN_GIT_CMD))
+   return error(FAILED_RUN, repack.argv[0]);
+
+   if (prune_expire) {
+   argv_array_push(prune, prune_expire);
+   if (quiet)
+   argv_array_push(prune, --no-progress);
+   if (run_command_v_opt(prune.argv, RUN_GIT_CMD))
+   return error(FAILED_RUN, prune.argv[0]);
+   }
}
 
if (prune_worktrees_expire) {
diff --git a/builtin/prune.c b/builtin/prune.c
index 0c73246..fc0c8e8 100644
--- a/builtin/prune.c
+++ b/builtin/prune.c
@@ -218,6 +218,9 @@ int cmd_prune(int argc, const char **argv, const char 
*prefix)
return 0;
}
 
+   if (repository_format_precious_objects)
+   die(cannot prune in a precious-objects repo);
+
while (argc--) {
unsigned char sha1[20];
const char *name = *argv++;
diff --git a/builtin/repack.c b/builtin/repack.c
index af7340c..8ae7fe5 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -193,6 +193,9 @@ int cmd_repack(int argc, const char **argv, const char 
*prefix)
argc = parse_options(argc, argv, prefix, builtin_repack_options,
git_repack_usage, 0);
 
+   if (delete_redundant  repository_format_precious_objects)
+   die(cannot repack in a precious-objects repo);
+
if (pack_kept_objects  0)
pack_kept_objects = write_bitmaps;
 
diff --git a/cache.h b/cache.h
index bee425b..9b59a63 100644
--- a/cache.h
+++ b/cache.h
@@ -694,6 +694,7 @@ extern int grafts_replace_parents;
 #define GIT_REPO_VERSION 0
 #define GIT_REPO_VERSION_READ 1
 extern int repository_format_version;
+extern int repository_format_precious_objects;
 extern int check_repository_format(void);
 
 #define MTIME_CHANGED  0x0001
diff --git a/environment.c b/environment.c
index 61c685b..da66e82 100644
--- a/environment.c
+++ b/environment.c
@@ -26,6 +26,7 @@ int warn_ambiguous_refs = 1;
 int warn_on_object_refname_ambiguity = 1;
 int ref_paranoia = -1;
 int repository_format_version;

[PATCH 1/2] introduce extensions form of core.repositoryformatversion

2015-06-23 Thread Jeff King
Normally we try to avoid bumps of the whole-repository
core.repositoryformatversion field. However, it is
unavoidable if we want to safely change certain aspects of
git in a backwards-incompatible way (e.g., modifying the set
of ref tips that we must traverse to generate a list of
unreachable, safe-to-prune objects).

If we were to bump the repository version for every such
change, then any implementation understanding version `X`
would also have to understand `X-1`, `X-2`, and so forth,
even though the incompatibilities may be in orthogonal parts
of the system, and there is otherwise no reason we cannot
implement one without the other (or more importantly, that
the user cannot choose to use one feature without the other,
weighing the tradeoff in compatibility only for that
particular feature).

This patch documents the existing repositoryformatversion
strategy and introduces a new format, 1, which lets a
repository specify that it must run with an arbitrary set of
extensions. This can be used, for example:

 - to inform git that the objects should not be pruned based
   only on the reachability of the ref tips (e.g, because it
   has clone --shared children)

 - that the refs are stored in a format besides the usual
   refs and packed-refs directories

Because we bump to format 1, and because format 1
requires that a running git knows about any extensions
mentioned, we know that older versions of the code will not
do something dangerous when confronted with these new
formats.

For example, if the user chooses to use database storage for
refs, they may set the extensions.refbackend config to
db. Older versions of git will not understand format 1
and bail. Versions of git which understand 1 but do not
know about refbackend, or which know about refbackend
but not about the db backend, will refuse to run. This is
annoying, of course, but much better than the alternative of
claiming that there are no refs in the repository, or
writing to a location that other implementations will not
read.

Note that we are only defining the rules for format 1 here.
We do not ever write format 1 ourselves; it is a tool that
is meant to be used by users and future extensions to
provide safety with older implementations.

Signed-off-by: Jeff King p...@peff.net
---
 Documentation/technical/repository-version.txt | 81 ++
 cache.h|  6 ++
 setup.c| 37 +++-
 t/t1302-repo-version.sh| 38 
 4 files changed, 159 insertions(+), 3 deletions(-)
 create mode 100644 Documentation/technical/repository-version.txt

diff --git a/Documentation/technical/repository-version.txt 
b/Documentation/technical/repository-version.txt
new file mode 100644
index 000..3d7106d
--- /dev/null
+++ b/Documentation/technical/repository-version.txt
@@ -0,0 +1,81 @@
+Git Repository Format Versions
+==
+
+Every git repository is marked with a numeric version in the
+`core.repositoryformatversion` key of its `config` file. This version
+specifies the rules for operating on the on-disk repository data. An
+implementation of git which does not understand a particular version
+advertised by an on-disk repository MUST NOT operate on that repository;
+doing so risks not only producing wrong results, but actually losing
+data.
+
+Because of this rule, version bumps should be kept to an absolute
+minimum. Instead, we generally prefer these strategies:
+
+  - bumping format version numbers of individual data files (e.g.,
+index, packfiles, etc). This restricts the incompatibilities only to
+those files.
+
+  - introducing new data that gracefully degrades when used by older
+clients (e.g., pack bitmap files are ignored by older clients, which
+simply do not take advantage of the optimization they provide).
+
+A whole-repository format version bump should only be part of a change
+that cannot be independently versioned. For instance, if one were to
+change the reachability rules for objects, or the rules for locking
+refs, that would require a bump of the repository format version.
+
+Note that this applies only to accessing the repository's disk contents
+directly. An older client which understands only format `0` may still
+connect via `git://` to a repository using format `1`, as long as the
+server process understands format `1`.
+
+The preferred strategy for rolling out a version bump (whether whole
+repository or for a single file) is to teach git to read the new format,
+and allow writing the new format with a config switch or command line
+option (for experimentation or for those who do not care about backwards
+compatibility with older gits). Then after a long period to allow the
+reading capability to become common, we may switch to writing the new
+format by default.
+
+The currently defined format versions are:
+
+Version `0`
+---
+
+This is the format defined by 

Re: [PATCH v4 03/19] ref-filter: implement '--points-at' option

2015-06-23 Thread Karthik Nayak
On Tue, Jun 23, 2015 at 4:06 AM, Eric Sunshine sunsh...@sunshineco.com wrote:

 s/a one/one/


Noted! Thanks

-- 
Regards,
Karthik Nayak
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v4 07/19] for-each-ref: add '--merged' and '--no-merged' options

2015-06-23 Thread Karthik Nayak
On Tue, Jun 23, 2015 at 4:11 AM, Eric Sunshine sunsh...@sunshineco.com wrote:

 According to the documentation you added to the OPTIONS section, the
 object following --merged and --no-merged is optional. Therefore,
 shouldn't this be s/object/[object]/ ?

 Also, in the OPTIONS section, you spelled it commit rather than object.



 Ditto: s/object/[object]/


Noted and changed, this would apply to --contains also :)

-- 
Regards,
Karthik Nayak
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] apply: fix adding new files on i-t-a entries

2015-06-23 Thread Junio C Hamano
On Tue, Jun 23, 2015 at 9:50 AM, Junio C Hamano gits...@pobox.com wrote:
 Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 - pos = cache_name_pos(name, strlen(name));
 + pos = exists_in_cache(name, strlen(name));

 Something that is named as if it would return yes/no that returns a
 real value is not a very welcome idea.

 +/* This is the same as index_name_pos, except that i-t-a entries are 
 invisible */
 +int exists_in_index(const struct index_state *istate, const char *name, int 
 namelen)
 +{
 + int pos = index_name_stage_pos(istate, name, namelen, 0);
 +
 + if (pos  0)
 + return pos;
 + if (istate-cache[pos]-ce_flags  CE_INTENT_TO_ADD)
 + return -pos-1;

 This is a useless complexity.  Your callers cannot use the returned
 value like this:

 pos = exists_in_cache(...);
 if (pos  0) {
 if (active_cache[-pos-1]-ce_flags  CE_INTENT_TO_ADD)
 ; /* ah it actually exists but it is i-t-a */
 else
 ; /* no it does not really exist */
 } else {
 ; /* yes it is really there at pos */
 }

 because they cannot tell two cases apart: (1) you do have i-t-a with
 the given name, (2) you do not have the entry but the location you
 would insert an entry with such a name is occupied by an unrelated
 entry (i.e. with a name that sorts adjacent) that happens to be
 i-t-a.

Also, the callers cannot even use that return value in the usual way they
would use the return value from index_name_pos(), either.

pos = exists_in_cache(...);
if (pos  0) {
/* ah, it does not exist, so... */
pos = -1 - pos;
/*
 * ... it is OK to shift active_cache[pos..] by one and add our
 * entry at active_cache[pos]
 */
   } else {
/* it exists, so update in place */
;
   }

So, returning pos that smells like a return value from index_name_pos()
only has an effect of confusing callers into buggy code, I am afraid. The
callers that care need to be updated to check for ce_flags after finding the
entry with index_name_pos() the usual way if you want to avoid search in
the index_state-cache[] twice, and the callers that are only interested in
knowing if an entry exists are better off with an exists_in_cache() that
returns Yes/No and not a confusing and useless pos, I think.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Enable core.fsyncObjectFiles by default

2015-06-23 Thread Duy Nguyen
On Wed, Jun 24, 2015 at 4:57 AM, Stefan Beller sbel...@google.com wrote:
 Linus Torvalds started a discussion[1] if we want to play rather safe
 than use defaults which make sense only for the most power users of Git:

 So git is safe in the sense that you won't really lose any data,
 but you may well be inconvenienced.  The fsync each object config
 option is there in case you don't want that inconvenience, but it
 should be noted that it can make for a hell of a performance impact.

 Of course, it might well be the case that the actual default
 might be worth turning around. Most git users probably don't
 care about that kind of apply two hundred patches from Andrew
 Morton kind of workload, although rebase a big patch-series
 does end up doing basically the same thing, and might be more
 common.

 This patch enables fsync_object_files by default.

Will this make nfs performance a lot worse or still within acceptable range?
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Enable core.fsyncObjectFiles by default

2015-06-23 Thread Junio C Hamano
Stefan Beller sbel...@google.com writes:

 Linus Torvalds started a discussion[1] if we want to play rather safe
 than use defaults which make sense only for the most power users of Git:

 So git is safe in the sense that you won't really lose any data,
 but you may well be inconvenienced.  The fsync each object config
 option is there in case you don't want that inconvenience, but it
 should be noted that it can make for a hell of a performance impact.

 Of course, it might well be the case that the actual default
 might be worth turning around. Most git users probably don't
 care about that kind of apply two hundred patches from Andrew
 Morton kind of workload, although rebase a big patch-series
 does end up doing basically the same thing, and might be more
 common.

 This patch enables fsync_object_files by default.

Sorry, but I fail to see which part of what Linus said (which is the
only thing you quoted from the discussion) argues for this patch.
If anything, I read that as cautioning people against making a
tradeoff based on an incorrect perception of risks and blindly
flipping this bit ON (the original discussion a few days ago, where
Ted says he has this bit ON while clarifying that he does so on SSD,
is also a sensible description on how he made his trade-off).

It's a different matter whom you would want to align with when
assessing your own risk tolerance.  If you quoted Corbet's original
message, then that would have been more consistent.


 [1] https://plus.google.com/u/1/+JonathanCorbet/posts/JBxiKPe3VXa

 Signed-off-by: Stefan Beller sbel...@google.com
 ---
  Documentation/config.txt | 8 
  environment.c| 2 +-
  2 files changed, 5 insertions(+), 5 deletions(-)

 diff --git a/Documentation/config.txt b/Documentation/config.txt
 index 43bb53c..dce2640 100644
 --- a/Documentation/config.txt
 +++ b/Documentation/config.txt
 @@ -693,10 +693,10 @@ core.whitespace::
  core.fsyncObjectFiles::
   This boolean will enable 'fsync()' when writing object files.
  +
 -This is a total waste of time and effort on a filesystem that orders
 -data writes properly, but can be useful for filesystems that do not use
 -journalling (traditional UNIX filesystems) or that only journal metadata
 -and not file contents (OS X's HFS+, or Linux ext3 with data=writeback).
 +This ensures objects are written to disk instead of relying on the
 +operating systems cache and eventual write. Disabling this option will
 +yield performance with a trade off in safety for repository corruption
 +during power loss.
  
  core.preloadIndex::
   Enable parallel index preload for operations like 'git diff'
 diff --git a/environment.c b/environment.c
 index 61c685b..b406f5e 100644
 --- a/environment.c
 +++ b/environment.c
 @@ -35,7 +35,7 @@ const char *git_attributes_file;
  int zlib_compression_level = Z_BEST_SPEED;
  int core_compression_level;
  int core_compression_seen;
 -int fsync_object_files;
 +int fsync_object_files = 1;
  size_t packed_git_window_size = DEFAULT_PACKED_GIT_WINDOW_SIZE;
  size_t packed_git_limit = DEFAULT_PACKED_GIT_LIMIT;
  size_t delta_base_cache_limit = 96 * 1024 * 1024;
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] Enable core.fsyncObjectFiles by default

2015-06-23 Thread Theodore Ts'o
The main issue is that non-expert users might not realize that they
really need to run git fsck after a crash; otherwise, what might
happen is that although git is only appending, that you might have
some zero-length (or partially written) git object or pack files, and
while you might not notice at that moment, it might come and bite you
later.  If you do try to recover immediately after a crash, in the
worst case you might have to do that git am -s /tmp/mbox-filled-with
patches command, but otherwise you won't lose much data.

So perhaps one alternative strategy to calling fsync(2) after every
single git object file write might be to have git create a zero-length
.git/in-progress-pid file, which gets fsync'ed, and then it can do
the git am -s /tmp/akpm-big-bag-o-patches processing nice and fast,
and once git is done, then we call call sync(2) and then delete the
in-progress file.

If there is an in-progress file in the .git directory, git would then
automatically run git fsck to make sure the repository is consistent.

For people who care, maybe that's a good compromise.  (Me, the way
things are right now is just fine since I have a nice fast SSD, and so
setting fsyncObjectfiles is a perfectly fine thing as far as I am
concerned. :-)

   - Ted
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html