Re: [GIT PULL] sound fixes for 3.6-rc6

2012-09-13 Thread Jeff King
On Thu, Sep 13, 2012 at 02:28:51PM +0200, Takashi Iwai wrote:

> > FWIW, it was an output from git-pull-request, which fell back to the
> > equivalent branch.  Usually I check it manually but I forgot it at
> > this time just before going to a meeting.
> > 
> > This was with git 1.7.11.5.  I'll check whether this still happens
> > with 1.7.12.
> 
> The same problem still happens with git 1.7.12.
> This is rather annoying than useful.

I can't reproduce here. What is your exact request-pull invocation? Is
request-pull showing a warning like:

  warn: You locally have sound-3.6 but it does not (yet)
  warn: appear to be at 
git://git.kernel.org/pub/scm/linux/kernel/git/tiwai/sound.git
  warn: Do you want to push it there, perhaps?

(it should do so since v1.7.11.2). Maybe we need to make it possible to
bump that warning to a fatal error?

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] tile: support GENERIC_KERNEL_THREAD and GENERIC_KERNEL_EXECVE

2012-10-23 Thread Jeff King
On Tue, Oct 23, 2012 at 10:47:28PM +0200, Thomas Gleixner wrote:

> I agree that this is a common issue. Acked-by/Reviewed-by mails come
> in after the fact that the patch has been committed to an immutable
> (i.e no-rebase mode) branch or if the change in question already hit
> Linus tree.
> 
> Still it would be nice to have a recording of that in the git tree
> itself.
> 
> Something like: "git --attach SHA1 " would be appreciated!

It is spelled:

  git notes add -m  SHA1

The resulting notes are stored in a separate revision-controlled branch
and can be pushed and pulled like regular refs. Note, though, that the
default refspecs do not yet include refs/notes, so you'd have to add
them manually. The workflows around notes are not very mature yet, so if
you start using them, feedback would be appreciated.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] tile: support GENERIC_KERNEL_THREAD and GENERIC_KERNEL_EXECVE

2012-10-23 Thread Jeff King
On Tue, Oct 23, 2012 at 10:09:46PM +0100, Catalin Marinas wrote:

> > It is spelled:
> >
> >   git notes add -m  SHA1
> >
> > The resulting notes are stored in a separate revision-controlled branch
> > and can be pushed and pulled like regular refs. Note, though, that the
> > default refspecs do not yet include refs/notes, so you'd have to add
> > them manually. The workflows around notes are not very mature yet, so if
> > you start using them, feedback would be appreciated.
> 
> What would be nice is that notes are pushed/pulled automatically with
> standard git push/fetch/pull commands. Usually git walks the DAG
> starting with the pulled commit or tag and following the parents. With
> notes, the reference is reversed, the note pointing to the commit and
> not the other way around. So handling this automatically in Git would
> be really useful.

Right, that's what I meant about the refspecs. You can configure git to
push or pull them automatically, but it is not the default. Something
like:

  git config --add remote.origin.fetch '+refs/notes/*:refs/notes/origin/*'

would be a start, but you'd also want to "git notes merge" upstream's
changes into your local notes (you _could_ just fetch straight into
refs/notes/, but if you are making your own notes locally, you have to
resolve it somehow). Exactly how to make this smooth is one of the workflow
considerations; there's been discussion, but most people aren't using
the feature, so we don't have a lot of data.

If you are asking whether we could "auto-follow" notes for commits that
have been fetched like we do for tags, the answer is "not really". The
notes tree is version-controlled as a whole, so you generally fetch the
whole thing or not. And the remote does not advertise note information
the same way we advertise peeled tag references, so a client does not
know which notes are available for fetch. The intended strategy is to
pull in the notes or not (though you can have multiple notes refs with
different names, and fetch just a subset of them).

> The other feature I'd like is that notes are automatically folded in
> the log during git rebase (maybe similar to the squash option). If you
> rebase, you lose all the notes (though this depends on the workflow,
> it may not be needed with published branches).

Git-rebase can automatically copy notes from one commit to another
during a rebase, but you need to set notes.rewriteRef to do so (see "git
help config" for details). The reason for this conservative default is
that some notes may not be appropriate for automatic copying (e.g., a
notes tree containing QA approval should probably be invalidated during
a rebase, whereas one with commentary probably should).

Squashing the notes into the commit message during rebase would be a
useful feature (at least for some type of notes), but that feature does
not currently exist (and as far as I recall, this is the first it has
been proposed).

Again, I think a lot of this comes down to the fact that not many people
are really using notes for their daily workflow, so these itches are not
coming up and getting scratched.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] tile: support GENERIC_KERNEL_THREAD and GENERIC_KERNEL_EXECVE

2012-10-23 Thread Jeff King
On Tue, Oct 23, 2012 at 11:25:06PM +0200, Thomas Gleixner wrote:

> > The resulting notes are stored in a separate revision-controlled branch
> 
> Which branch(es) is/are that ? What are the semantics of that?

They are stored in refs/notes/commits by default, but you can have
multiple notes refs if you want to store logically distinct sets of
notes.

A notes ref's tree is just a tree whose entries are sha1s, and the file
contents contain the notes themselves (the sha1s are broken down into
subdirectories for performance, but "git notes" handles this behind the
scenes). Technically you could check it out as a branch, edit, and
commit, but "git checkout" is not happy to have a HEAD outside of
refs/heads/, so you are stuck with plumbing like:

  $ git checkout `git rev-parse refs/notes/commits`
  $ edit edit edit
  $ git commit ...
  $ git update-ref refs/notes/commits HEAD

It's probably not good for much beyond exploring how notes are
implemented. See "git help notes" for more discussion.

> Assume I commit something to branch "foo"
> 
> Now I get that late Ack/Reviewed-by and want to associate that to that
> commit in branch "foo". Does that go into "notes/foo" ?

No. It would go into refs/notes/commits, or you could ask it to go to
refs/notes/acks if you wanted to keep them separate from your default
notes. It is indexed by commit object, not by branch (so if that branch
later goes away, the notes are always still attached to the commit
objects, assuming they got merged in).

> Later when I send a pull request to my upstream maintainer for branch
> "foo" does he get "notes/foo" automagically or do I have to request to
> pull him that separately?

No, he would have to pull your notes separately. Most of the discussion
around sharing has been configuring the default refspec configuration to
fetch notes.  But in the kernel you guys use a lot of one-off pulls
without configured remotes. I'm not sure what the right workflow would
be. It might simply be to ask git to always pull particular notes
commits at the same time (so you might push your notes to
refs/notes/for-linus, and then git would automatically grab the notes
when somebody pulls refs/heads/for-linus).

> Either way is fine for me, though something which lets me "automate"
> that would be appreciated. I can work around that easily as my pull
> requests are generated via scripts, so I can add the secondary one for
> the dependent "notes" branch if necessary. Though it would be nice to
> avoid that. Avoiding that, i.e having a straight connection (maybe
> configurable) between "foo" and "notes/foo" and the commits which have
> not yet hit my upstream maintainer would make my life easier. I.e. I
> just have to check "foo" for stuff which is not upstream yet instead
> of checking both, but that might just be my laziness.
> 
> Thoughts?

That all makes sense. Putting extra work on the puller is not a good
long-term solution. So while sending them an extra "also pull these
notes" line, even if it ends up being a cut-and-pastable single-liner,
is not great (even if it is the most flexible thing). Using a convention
based on name-equivalence seems like a sensible compromise.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] tile: support GENERIC_KERNEL_THREAD and GENERIC_KERNEL_EXECVE

2012-10-23 Thread Jeff King
On Tue, Oct 23, 2012 at 03:06:59PM -0700, Marc Gauthier wrote:

> Can a later commit be eventually be made to reference some set
> of notes added so far, so they become part of the whole history
> signed by the HEAD SHA1?  hence pulled/pushed automatically as
> well.  Otherwise do you not end up with a forever growing separate
> tree of notes that loses some of the properties of being behind
> the head SHA1 (and perhaps less scalable in manageability)?
> Also that way notes are separate only temporarily.

Interesting idea. It would be tough to do with existing objects. There
are really only two ways for a commit to reference objects:

  1. Via a parent header. But we would not want to include the notes
 tree as a separate parent. The semantics are all wrong, and would
 make your commit look like a nonsense merge.

  2. As an entry in a tree. But we do not enforce connectivity of
 commits referenced in trees, because that is the way that
 submodules are implemented.

So I think we would have to add a new header that says "also, here are
some notes for my history". That has two problems:

  1. It's not backwards compatible. People with the new version of git
 will expect to have objects referenced by the new header, but older
 servers may not provide those objects (and vice versa). We can add
 a protocol extension to communicate this, but fundamentally you are
 going to lose the object connection any time it passes through a
 repo running an older git.

  2. It's impure from the perspective of git's data model. Adding in the
 notes reference is not really a property of the commit. It's more
 like saying "Oh, these other things happened to _past_ commits, and
 I'm just now mentioning them". So you pick an arbitrary commit to
 attach it to. What are the semantics with relation to that commit's
 position in the history graph? If I have a commit that is identical
 but without the notes reference, it will have a different sha1. But
 is it the same commit?

So it's a bit ugly. I think I'd rather build out the transfer
infrastructure to pass the notes references around more gracefully
without trying to shoehorn them into the commit graph.

> As for automating the inclusion of notes in the flow, can that
> be conditional on some pattern in the note, so that e.g. the
> Acked-by's get included and folded in automatically, whereas
> others do not, according to settings?

Yeah. You can store arbitrary data in notes (e.g., one of the existing
uses of notes is to record metadata on the patch emails that led to a
commit). Right now you typically separate it out by data type into
separate refs, and then you ask git log to show you particular ones (so
we see refs/notes/commits with "--notes", but you can do "--notes=foo"
to see refs/notes/foo, or even show multiple refs).

For the fold-on-rebase idea, I'd think you would want something similar,
like setting rebase.foldNotes to "foo" to say "refs/notes/foo contains
pseudo-headers that should be folded in like a signed-off-by".

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ANNOUNCE] Git v1.9-rc0

2014-01-22 Thread Jeff King
On Wed, Jan 22, 2014 at 08:30:30PM +, Ken Moffat wrote:

>  Two questions: Does regenerating (e.g. if the tarball has dropped
> out of the cache) change its sums (md5sum or similar) ?  In (beyond)
> linuxfromscratch we use md5sums to verify that a tarball has not
> changed.

The tarballs we auto-generate from tags are cached, but they can
change if the cached version expires _and_ the archive-generation code
changes.

We use "git archive" to generate the tarballs themselves, and then gzip
the with "gzip -n". So it should be consistent from run to run. However,
very occasionally there are bugfixes in "git archive" which can affect
the output. E.g., commit 22f0dcd (archive-tar: split long paths more
carefully, 2013-01-05) changes the representation of certain long paths,
and generating a tarball with and without it will result in different
checksums (for some repos).

So if you are planning on baking md5sums into a package-build system, it
is much better to point at "official" releases which are rolled once by
the project maintainer, rather than the automatic tag page.

Junio, since you prepare such tarballs[1] anyway for kernel.org, it
might be worth uploading them to the "Releases" page of git/git.  I
imagine there is a programmatic way to do so via GitHub's API, but I
don't know offhand. I can look into it if you are interested.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ANNOUNCE] Git v1.9-rc0

2014-01-24 Thread Jeff King
On Thu, Jan 23, 2014 at 10:15:33AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > Junio, since you prepare such tarballs[1] anyway for kernel.org, it
> > might be worth uploading them to the "Releases" page of git/git.  I
> > imagine there is a programmatic way to do so via GitHub's API, but I
> > don't know offhand. I can look into it if you are interested.
> 
> I already have a script that takes the three tarballs and uploads
> them to two places, so adding GitHub as the third destination should
> be a natural and welcome way to automate it.

I came up with the script below, which you can use like:

  ./script v1.8.2.3 git-1.8.2.3.tar.gz

It expects the tag to already be pushed up to GitHub.  I'll leave
sticking it on the "todo" branch and integrating it into RelUpload to
you. This can also be used to backfill the old releases (though I looked
on k.org and it seems to have only partial coverage).

It sets the "prerelease" flag for -rc releases, but I did not otherwise
fill in any fields, including the summary and description. GitHub seems
to display reasonably if they are not set.

-- >8 --
#!/bin/sh
#
# usage: $0  

repo=git/git

# replace this with however you store your oauth token
# if you don't have one, make one here:
# https://github.com/settings/tokens/new
token() {
  pass -n github.web.oauth
}

post() {
  curl -H "Authorization: token $(token)" "$@"
}

# usage: create 
create() {
  case "$1" in
  *-rc*)
prerelease=true
;;
  *)
prerelease=false
;;
  esac

  post -d '
  {
"tag_name": "'"$1"'",
"prerelease": '"$prerelease"'
  }' "https://api.github.com/repos/$repo/releases";
}

# use: upload  
upload() {
  url="https://uploads.github.com/repos/$repo/releases/$1/assets"; &&
  url="$url?name=$(basename $2)" &&
  post -H "Content-Type: $(file -b --mime-type "$2")" \
   --data-binary "@$2" \
   "$url"
}

# This is a hack. If you don't mind a dependency on
# perl's JSON (or another parser), we can do a lot better.
extract_id() {
  perl -lne '/"id":\s*(\d+)/ or next; print $1; exit 0'
}

create "$1" >release.json &&
id=$(extract_id /dev/null &&
rm -f release.json
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ANNOUNCE] Git v1.9-rc0

2014-01-27 Thread Jeff King
On Mon, Jan 27, 2014 at 02:56:28PM -0800, Junio C Hamano wrote:

> > # replace this with however you store your oauth token
> > # if you don't have one, make one here:
> > # https://github.com/settings/tokens/new
> > token() {
> >   pass -n github.web.oauth
> 
> Hmph, what is this "pass" thing?

It's a poor man's Keychain:

  https://github.com/peff/pass

Judging from your use of netrc in Meta/RelUpload, you probably just
want:

  token() {
cat ~/.github-token
  }

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] commit: Add -f, --fixes option to add Fixes: line

2013-10-28 Thread Jeff King
On Mon, Oct 28, 2013 at 12:29:32PM +0100, Johan Herland wrote:

> > A hook-based solution could do this.  But a built-in "all-purpose"
> > handler like "footer.Fixes.arg=commit", which was intended to be
> > reusable, wouldn't be able to do such footer-specific extra work without
> > having to create new special cases in git each time.
> 
> Which begs the question (posed to all, not specifically to you): Why
> would we want solve this issue in config instead of in hooks? The
> hooks will always be more flexible and less dependent on making
> changes in git.git. (...a suitably flexible hook could even use the
> config options discussed above as input...) In both cases, we need the
> user to actively enable the functionality (either installing hooks, or
> setting up config), and in both cases we could bundle Git with
> defaults that solve the common cases, so that is not a useful
> differentiator between the two approaches. I would even venture to
> ask: If we end up solving this problem in config and not in hooks,
> then why do we bother having hooks in the first place?

One thing that is much nicer with config vs hooks is that you can manage
config for all of your repositories by tweaking ~/.gitconfig (and that
is where I would expect this type of config to go).

Managing hooks globally means having each repo symlink to a central hook
area, and having the forethought to set up the symlink farm and use
init.templatedir before cloning any repos.  We could probably make this
friendlier by reading from ~/.githooks and defining some semantics for
multiple hooks. E.g., fall back to ~/.githooks if the repo hook is not
executable, or possibly run them both (or even allow multiple instances
of a hook in ~/.githooks, which can help organization), and consider the
hook a failure if any of them fail.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] commit: Add -f, --fixes option to add Fixes: line

2013-10-28 Thread Jeff King
On Mon, Oct 28, 2013 at 11:10:13PM +0100, Thomas Rast wrote:

> * In your list
> 
> >   Fixes:
> >   Reported-by:
> >   Suggested-by:
> >   Improved-by:
> >   Acked-by:
> >   Reviewed-by:
> >   Tested-by:
> >   Signed-off-by:
> 
>   and I might add
> 
> Cherry-picked-from:
> Reverts:
> 
>   if one were to phrase that as a footer/pseudoheader, observe that
>   there are only two kinds of these: footers that contain identities,
>   and footers that contain references to commits.

I think people put other things in, too. For example, cross-referencing
bug-tracker ids.

In fact, if I saw "fixes: XXX", I would expect the latter to be a
tracker id.  People do this a lot with GitHub issues, because GitHub
will auto-close issue 123 if a commit with "fixes #123" is pushed to
master. Because of the "#", no pseudo-header is needed, but I have also
seen people use the footer style (I don't have any examples on-hand,
though).

That being said, in your examples:

> So why not support these use-cases?  We could have something like
> footer.foo.* configuration, e.g.
> 
> [footer "fixes"]
> type = commit
> suggest = true
> [footer "acked-by"]
> type = identity

you could easily have "type=text" to handle arbitrary text.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: project wide: git config entry for [diff] renames=true

2014-09-25 Thread Jeff King
On Thu, Sep 25, 2014 at 08:48:31AM -0700, Joe Perches wrote:

> On Thu, 2014-09-25 at 17:03 +0200, Greg Kroah-Hartman wrote:
> 
> > In the future, please generate a git "move" diff, which makes it easier
> > to review, and prove that nothing really changed.  It also helps if the
> > file is a bit different from what you diffed against, which in my case,
> > was true.
> 
> Maybe it'd be possible to add 
> 
> [diff]
>   renames = true
> 
> to the .git/config file.
> 
> but I don't find a mechanism to add anything to the
> .git/config and have it be pulled.

There is no such mechanism within git. We've resisted adding one because
of the danger of something like:

  [diff]
external = rm -rf /

diff.renames is probably safe, but any config-sharing mechanism would
have to deal with either whitelisting, or providing some mechanism for
the puller to review changes before blindly following them.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] apply: refuse touching a file beyond symlink

2015-01-30 Thread Jeff King
On Thu, Jan 29, 2015 at 12:45:22PM -0800, Junio C Hamano wrote:

> +static int path_is_beyond_symlink(const char *name_)
> +{
> + struct strbuf name = STRBUF_INIT;
> +
> + strbuf_addstr(&name, name_);
> + do {
> + struct patch *previous;
> +
> + while (--name.len && name.buf[name.len] != '/')
> + ; /* scan backwards */
> + if (!name.len)
> + break;

I imagine it is impossible here for "name_" to be initially empty, but
it would make the backwards-scan loop go quite badly. Worth a comment or
an assert()?

> + name.buf[name.len] = '\0';
> + previous = in_fn_table(name.buf);
> + if (previous) {
> + if (!was_deleted(previous) &&
> + !to_be_deleted(previous) &&
> + previous->new_mode &&
> + S_ISLNK(previous->new_mode))
> + goto symlink_found;
> + } else if (check_index) {
> + int pos = cache_name_pos(name.buf, name.len);
> + if (0 <= pos &&
> + S_ISLNK(active_cache[pos]->ce_mode))
> + goto symlink_found;
> + } else {
> + struct stat st;
> + if (!lstat(name.buf, &st) && S_ISLNK(st.st_mode))
> + goto symlink_found;
> + }
> + } while (1);
> +
> + strbuf_release(&name);
> + return 0;
> +symlink_found:
> + strbuf_release(&name);
> + return 1;

Style nit, but might this be easier to follow the logic without the
gotos, by putting the setup and cleanup in a wrapper function and
returning directly from the main logic?

  static int path_is_beyond_symlink(const char *name)
  {
struct strbuf buf = STRBUF_INIT;
int ret;

strbuf_addstr(&buf, name);
ret = path_is_beyond_symlink_1(name);
strbuf_release(&buf);

return ret;
  }

I can live with it either way, though.

> + if (!patch->is_delete && path_is_beyond_symlink(patch->new_name))
> + return error(_("affected file '%s' is beyond a symbolic link"),
> +  patch->new_name);

Why does this not kick in when deleting a file? If it is not OK to
add across a symlink, why is it OK to delete? IOW, why should this test
fail:

diff --git a/t/t4122-apply-symlink-inside.sh b/t/t4122-apply-symlink-inside.sh
index 0a8de4a..f03b604 100755
--- a/t/t4122-apply-symlink-inside.sh
+++ b/t/t4122-apply-symlink-inside.sh
@@ -64,6 +64,7 @@ test_expect_success SYMLINKS 'do not follow symbolic link 
(setup)' '
>arch/x86_64/dir/file &&
git add arch/x86_64/dir/file &&
git diff HEAD >add_file.patch &&
+   git diff -R HEAD >del_file.patch &&
git reset --hard &&
rm -fr arch/x86_64/dir &&
 
@@ -111,7 +112,11 @@ test_expect_success SYMLINKS 'do not follow symbolic link 
(existing)' '
 
test_must_fail git apply --cached add_file.patch 2>error-ct-file &&
test_i18ngrep "beyond a symbolic link" error-ct-file &&
-   test_must_fail git ls-files --error-unmatch arch/i386/dir
+   test_must_fail git ls-files --error-unmatch arch/i386/dir &&
+
+   >arch/i386/dir/file &&
+   test_must_fail git apply del_file.patch &&
+   test_path_is_file arch/i386/dir/file
 '
 
 test_done

> + test ! -e arch/x86_64/dir &&
> + test ! -e arch/i386/dir/file &&

Minor nit: use test_path_is_missing here (and elsewhere in the added
tests).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/1] apply: reject input that touches outside $cwd

2015-01-30 Thread Jeff King
On Thu, Jan 29, 2015 at 03:48:14PM -0800, Junio C Hamano wrote:

> By default, a patch that affects outside the working area is
> rejected as a mistake; Git itself never creates such a patch
> unless the user bends backwards and specifies nonstandard
> prefix to "git diff" and friends.
> 
> When `git apply` is used without either `--index` or `--cached`
> option as a "better GNU patch", the user can pass `--allow-uplevel`
> option to override this safety check.  This cannot be used to escape
> outside the working tree when using `--index` or `--cached` to apply
> the patch to the index.

It looks like your new --allow-uplevel goes to verify_path(). So this
isn't just about "..", but it will also protect against applying a patch
inside ".git". Which seems like a good thing to me, but I wonder if the
option name is a little misleading. It is really about applying the same
checks we do for index paths to the non-index mode of "git apply".

>  * Meant to apply on top of the previous one, but these two are
>about separate and orthogonal issues.

I agree they are orthogonal in concept, though I doubt the symlink tests
here would pass without the previous one (since verify_path does not
know or care about crossing symlink boundaries).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 2/1] apply: reject input that touches outside $cwd

2015-01-30 Thread Jeff King
On Fri, Jan 30, 2015 at 11:07:34AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > It looks like your new --allow-uplevel goes to verify_path(). So this
> > isn't just about "..", but it will also protect against applying a patch
> > inside ".git". Which seems like a good thing to me, but I wonder if the
> > option name is a little misleading.
> 
> True; not just misleading but is incorrect, I would say.
> Suggestions?

I think just "--verify-paths" (and "--no-verify-paths", since the former
would be the default) might be fine. That leaves the definition of
"verify" vague, but I think that's OK. It used to mean "no '..' and no
'.git'", and now it has been widened to include "no weird
filesystem-specific variants of .git".

If you wanted to avoid the negative being the commonly used option,
maybe "--unsafe-paths" (or "--allow-unsafe-paths" if you like verbs).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] apply: refuse touching a file beyond symlink

2015-01-30 Thread Jeff King
On Fri, Jan 30, 2015 at 11:42:49AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > On Thu, Jan 29, 2015 at 12:45:22PM -0800, Junio C Hamano wrote:
> >
> >> +  if (!patch->is_delete && path_is_beyond_symlink(patch->new_name))
> >> +  return error(_("affected file '%s' is beyond a symbolic link"),
> >> +   patch->new_name);
> >
> > Why does this not kick in when deleting a file?
> 
> Half-written logic, forgotten to be revisited (i.e. "ok, anything
> that is not delete we can check new_name, so do that first, later
> we'd deal with deletion patch and I think the way to do so is by
> checking old_name, but let's make sure this case works first").

OK, I was worried I was missing something clever. :)

I agree that checking patch->old_name should work in that case.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] apply: refuse touching a file beyond symlink

2015-01-30 Thread Jeff King
On Fri, Jan 30, 2015 at 11:48:14AM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> >> +  if (!patch->is_delete && path_is_beyond_symlink(patch->new_name))
> >> +  return error(_("affected file '%s' is beyond a symbolic link"),
> >> +   patch->new_name);
> >
> > Why does this not kick in when deleting a file? If it is not OK to
> > add across a symlink, why is it OK to delete?
> 
> Hmph, adding
> 
>   if (patch->is_delete && path_is_beyond_symlink(patch->old_name))
>   return error(_("deleted file '%s' is beyond a symlink"),
>   patch->old_name);
> 
> seems to break t4114.11, which wants to apply this patch to a tree
> that does not have a symbolic link but a directory at 'foo/'.
> 
> diff --git a/foo b/foo
> new file mode 12
> index 000..ba0e162
> --- /dev/null
> +++ b/foo
> @@ -0,0 +1 @@
> +bar
> \ No newline at end of file
> diff --git a/foo/baz b/foo/baz
> deleted file mode 100644
> index 682c76b..000
> --- a/foo/baz
> +++ /dev/null
> @@ -1 +0,0 @@
> -if only I knew

Hrm. That only works in the current code because we apply the deletion
in the directory (and then clean up the now-empty directory) first. So I
think you would need to check the paths progressively as you apply them,
since those other parts of the diff "haven't happened yet".

If we take deletion as one phase and addition as another, I think you
could get away with:

diff --git a/builtin/apply.c b/builtin/apply.c
index f5491cd..12c9d8e 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -3549,7 +3549,7 @@ static int check_to_create(const char *new_name, int 
ok_if_exists)
return 0;
 }
 
-static int path_is_beyond_symlink(const char *name_)
+static int path_is_beyond_symlink(const char *name_, int check_table)
 {
struct strbuf name = STRBUF_INIT;
 
@@ -3562,7 +3562,8 @@ static int path_is_beyond_symlink(const char *name_)
if (!name.len)
break;
name.buf[name.len] = '\0';
-   previous = in_fn_table(name.buf);
+
+   previous = check_table ? in_fn_table(name.buf) : NULL;
if (previous) {
if (!was_deleted(previous) &&
!to_be_deleted(previous) &&
@@ -3676,9 +3677,12 @@ static int check_patch(struct patch *patch)
}
}
 
-   if (!patch->is_delete && path_is_beyond_symlink(patch->new_name))
+   if (!patch->is_delete && path_is_beyond_symlink(patch->new_name, 1))
return error(_("affected file '%s' is beyond a symbolic link"),
 patch->new_name);
+   if (patch->is_delete && path_is_beyond_symlink(patch->old_name, 0))
+   return error(_("affected file '%s' is beyond a symbolic link"),
+patch->old_name);
 
if (apply_data(patch, &st, ce) < 0)
return error(_("%s: patch does not apply"), name);


but I suspect we could construct a case that depends more closely on the
order of application. E.g., a patch that does:

  1. add foo as a symlink
  2. add file foo/bar

is definitely wrong. But is:

  1. add file foo/bar
  2. add foo as a symlink

does not technically fall afoul of the symlink rules. It is a _bogus_
patch, of course, because the second part will get a D/F conflict. I am
not sure if there are any legitimate patches that could run into this
ordering problem, but even without it, it smells a bit funny to complain
for the wrong reason.

Looking at the code, though, it seems like we should be doing these
checks progressively as we add entries to the fn_table. So that is doing
the right thing. It is only the deletion re-ordering that trips us up.
Can we reorder all deletions before all additions before calling
check_patch on each?

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] apply: refuse touching a file beyond symlink

2015-01-30 Thread Jeff King
On Fri, Jan 30, 2015 at 12:11:30PM -0800, Junio C Hamano wrote:

> I am not sure how to fix this, without completely ripping out the
> misguided "We should be able to concatenate outputs from multiple
> invocations of 'git diff' into a single file and apply the result
> with a single invocation of 'git apply'" change I grudgingly
> accepted long time ago (7a07841c (git-apply: handle a patch that
> touches the same path more than once better, 2008-06-27).
> 
> "git diff" output is designed each patch to apply independently to
> the preimage to produce the postimage, and that allows patches to
> two files can be swapped via -Oorderfile mechanism, and also "X was
> created by copying from Y and Y is modified in place" will result in
> X with the contents of Y in the preimage (i.e. before the in-place
> modification of Y in the same patch) regardless of the order of X
> and Y in the "git diff" output.  The above input used by t4114.11
> expects to remove 'foo/baz' (leaving an empty directory foo as an
> result but we do not track directories so it can be nuked to make
> room if other patch in the same input wants to put something else,
> either a regular file or a symbolic link, there) and create a blob
> at 'foo', and such an input should apply regardless of the order of
> patches in it.
> 
> The in_fn_table[] stuff broke that design completely.

I had the impression that we did not apply in any arbitrary order that
could work, but rather that we did deletions first followed by
additions. But I am fairly ignorant of the apply code.

If that assumption is correct, then I think we could just follow the
same phases that the actual application does. Here's a hacky version
below. Probably the check of phase versus is_delete needs to be better
(and ideally the logic would be factored out of write_one_result so they
always match).

diff --git a/builtin/apply.c b/builtin/apply.c
index f5491cd..85364b8 100644
--- a/builtin/apply.c
+++ b/builtin/apply.c
@@ -3593,7 +3593,7 @@ symlink_found:
  * Check and apply the patch in-core; leave the result in patch->result
  * for the caller to write it out to the final destination.
  */
-static int check_patch(struct patch *patch)
+static int check_patch(struct patch *patch, int phase)
 {
struct stat st;
const char *old_name = patch->old_name;
@@ -3604,6 +3604,9 @@ static int check_patch(struct patch *patch)
int ok_if_exists;
int status;
 
+   if (!phase != patch->is_delete)
+   return 0;
+
patch->rejected = 1; /* we will drop this after we succeed */
 
status = check_preimage(patch, &ce, &st);
@@ -3679,6 +3682,9 @@ static int check_patch(struct patch *patch)
if (!patch->is_delete && path_is_beyond_symlink(patch->new_name))
return error(_("affected file '%s' is beyond a symbolic link"),
 patch->new_name);
+   if (patch->is_delete && path_is_beyond_symlink(patch->old_name))
+   return error(_("affected file '%s' is beyond a symbolic link"),
+patch->old_name);
 
if (apply_data(patch, &st, ce) < 0)
return error(_("%s: patch does not apply"), name);
@@ -3686,7 +3692,7 @@ static int check_patch(struct patch *patch)
return 0;
 }
 
-static int check_patch_list(struct patch *patch)
+static int check_patch_list_1(struct patch *patch, int phase)
 {
int err = 0;
 
@@ -3695,12 +3701,22 @@ static int check_patch_list(struct patch *patch)
if (apply_verbosely)
say_patch_name(stderr,
   _("Checking patch %s..."), patch);
-   err |= check_patch(patch);
+   err |= check_patch(patch, phase);
patch = patch->next;
}
return err;
 }
 
+static int check_patch_list(struct patch *patch)
+{
+   int err = 0;
+   int phase;
+
+   for (phase = 0; phase < 2; phase++)
+   err |= check_patch_list_1(patch, phase);
+   return err;
+}
+
 /* This function tries to read the sha1 from the current index */
 static int get_current_sha1(const char *path, unsigned char *sha1)
 {
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] apply: refuse touching a file beyond symlink

2015-01-30 Thread Jeff King
On Fri, Jan 30, 2015 at 12:20:02PM -0800, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > I had the impression that we did not apply in any arbitrary order that
> > could work, but rather that we did deletions first followed by
> > additions. But I am fairly ignorant of the apply code.
> 
> No, you are thinking about the write-out of the finished result,
> which may have to turn existing directory to a file or vice versa on
> the filesystem, but that happens _after_ we decide what to turn into
> what else, completely in-core.
> 
> And the decision to determine what the input _means_ should not
> depend on the order of patches in the input.

Ah, OK. Yeah, doing it progressively can only be accurate if our
name-checks follow the same order as applying, because we are checking
against a particular state.

But could we instead pull this check to just before the write-out time?
That is, to let any horrible thing happen in-core, as long as what we
write out to the index and the filesystem is sane?

-Peff
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: Re* [ANNOUNCE] Git v2.27.0-rc1

2020-05-20 Thread Jeff King
On Wed, May 20, 2020 at 01:31:30PM -0700, Junio C Hamano wrote:

> The promotion in Git 2.26 was buried in the "performance &
> implementation details" section and not in the backward
> compatibility section, so it feels a bit funny to highlight the
> reversion.  In any case, here is what I prepared (but not committed
> yet)

Hmm, yeah, that does make it feel funny. I'd argue that it probably
would have been worth promoting a bit more in v2.26, but it is obviously
too late now.

> -- >8 --
> Subject: [PATCH] RelNotes: advertise the demotion of protocol v2

What you have here looks fine, but I'd be OK if we just left it as-is,
too, given the v2.26 mention.

-Peff


Re: [ANNOUNCE] Git v2.27.0-rc1

2020-05-20 Thread Jeff King
On Wed, May 20, 2020 at 12:17:11PM -0700, Junio C Hamano wrote:

> Git 2.27 Release Notes (draft)
> ==
> 
> Updates since v2.26
> ---
> 
> Backward compatibility notes

Is it worth mentioning here the reversion of v2 as the default protocol?

It does end up (along with the actual code fix) in the "fixes from
v2.26" section:

>  * Those fetching over protocol v2 from linux-next and other kernel
>repositories are reporting that v2 often fetches way too much than
>needed.
>(merge 11c7f2a30b jn/demote-proto2-from-default later to maint).
> 
>  * The upload-pack protocol v2 gave up too early before finding a
>common ancestor, resulting in a wasteful fetch from a fork of a
>project.  This has been corrected to match the behaviour of v0
>protocol.
>(merge 2f0a093dd6 jt/v2-fetch-nego-fix later to maint).

but that's somewhat buried. I dunno. It is not likely to introduce _new_
compatibility issues, but perhaps folks looking into compatibility stuff
may want to know about the revert.

-Peff


Re: [Breakage] Git v2.21.0-rc0 - t5318 (NonStop)

2019-02-08 Thread Jeff King
On Fri, Feb 08, 2019 at 06:08:33AM -0500, Randall S. Becker wrote:

> t5318 is rather problematic and I have no good way to fix this. There
> is no /dev/zero on the platform, and the corrupt_graph_and_verify
> hard-codes if=/dev/zero, which is a linux-specific pseudo device.
> Please provide a more platform independent way of testing this
> feature. Pretty much all subtests from 46 onward fail as a result.

We did discuss this at the time of the patch, but it seems we already
use /dev/zero in a bunch of places:

  https://public-inbox.org/git/xmqqbm57rkg5@gitster-ct.c.googlers.com/

Were you just skipping the other tests before?

-Peff


Re: [Breakage] Git v2.21.0-rc0 - t5318 (NonStop)

2019-02-08 Thread Jeff King
On Fri, Feb 08, 2019 at 12:49:59PM -0500, Randall S. Becker wrote:

> > We did discuss this at the time of the patch, but it seems we already use
> > /dev/zero in a bunch of places:
> > 
> >   https://public-inbox.org/git/xmqqbm57rkg5@gitster-ct.c.googlers.com/
> > 
> > Were you just skipping the other tests before?
> 
> I did not catch the implications of the review at the time - my bad. We were 
> not intentionally skipping the tests. It looks like some are automatically 
> skipped. t4153 automatically skips (missing TTY), and t5562 fails also but 
> for a different reason (hang - we don't have apache2 to serve up http 
> content).
> 
> Would you object to something like this:
> 
> if [ ! -e /dev/zero ]; then
>   # use shred or some other mechanism (still trying to figure out a 
> solution)
> else
>   # existing dd
> fi

That's fine, as long as it's wrapped up in a function in order to keep
the tests readable.

Though I suspect we may be able to just find a solution that works
everywhere, without having two different implementations. If we know we
need $count bytes for dd, we could probably just generate a file with
that many NULs in it.

Other cases don't seem to actually care that they're getting NULs, and
are just redirecting stdin from /dev/zero to get an infinite amount of
input. They could probably use "yes" for that.

-Peff


Re: [Breakage] Git v2.21.0-rc0 - t5318 (NonStop)

2019-02-08 Thread Jeff King
On Fri, Feb 08, 2019 at 01:47:04PM -0500, Randall S. Becker wrote:

> > Though I suspect we may be able to just find a solution that works
> > everywhere, without having two different implementations. If we know we
> > need $count bytes for dd, we could probably just generate a file with that
> > many NULs in it.
> 
> For this, we could use truncate -s count file instead of dd to get a
> fixed size file of nulls. This would remove the need for /dev/zero in
> t5318 (the patch below probably will wrap badly in my mailer so I can
> submit a real patch separately.

I don't think "truncate" is portable, though.

> > Other cases don't seem to actually care that they're getting NULs, and are
> > just redirecting stdin from /dev/zero to get an infinite amount of input. 
> > They
> > could probably use "yes" for that.
> 
> What about reading from /dev/null?

That would yield zero bytes, not an infinite number of them.

-Peff


Re: [Breakage] Git v2.21.0-rc0 - t5318 (NonStop)

2019-02-08 Thread Jeff King
On Fri, Feb 08, 2019 at 02:26:17PM -0500, Randall S. Becker wrote:

> > > For this, we could use truncate -s count file instead of dd to get a
> > > fixed size file of nulls. This would remove the need for /dev/zero in
> > > t5318 (the patch below probably will wrap badly in my mailer so I can
> > > submit a real patch separately.
> > 
> > I don't think "truncate" is portable, though.
> 
> It is available AFAIK on Linux, POSIX, and Windows under Cygwin.
> That's more than /dev/zero has anyway. I have the patch ready if you
> want it.

Is it POSIX? Certainly truncate() is, but I didn't think the
command-line tool was. If it really is available everywhere, then yeah,
I'd be fine with it.

> > > > Other cases don't seem to actually care that they're getting NULs,
> > > > and are just redirecting stdin from /dev/zero to get an infinite
> > > > amount of input. They could probably use "yes" for that.
> > >
> > > What about reading from /dev/null?
> > 
> > That would yield zero bytes, not an infinite number of them.
> 
> So something like: yes | tr 'y' '\0' | stuff?

Exactly (if we even care about them being NULs; otherwise, we can omit
the "tr" invocation).

-Peff


Re: [Breakage] Git v2.21.0-rc0 - t5318 (NonStop)

2019-02-08 Thread Jeff King
On Fri, Feb 08, 2019 at 05:12:43PM -0500, Randall S. Becker wrote:

> On February 8, 2019 17:07, brian m. carlson wrote:
> > On Fri, Feb 08, 2019 at 02:31:57PM -0500, Jeff King wrote:
> > > > It is available AFAIK on Linux, POSIX, and Windows under Cygwin.
> > > > That's more than /dev/zero has anyway. I have the patch ready if you
> > > > want it.
> > >
> > > Is it POSIX? Certainly truncate() is, but I didn't think the
> > > command-line tool was. If it really is available everywhere, then
> > > yeah, I'd be fine with it.
> > 
> > It's not. POSIX doesn't specify the command, and macOS lacks it, I believe.
> 
> I'm happy to modify the test (it is in one spot), to make a decision based on:
> a) whether /dev/zero exists
> b) whether the system is a NonStop
> c) something else
> 
> What would you all prefer? It doesn't matter to me one way or another,
> as long as I can get the dependency to /dev/zero removed so tests will
> run here.

For the case in t5318, I think we can just put the NULs in a file. Does
this work on your platform?

---
diff --git a/t/t5318-commit-graph.sh b/t/t5318-commit-graph.sh
index 16d10ebce8..6d0ccc7eba 100755
--- a/t/t5318-commit-graph.sh
+++ b/t/t5318-commit-graph.sh
@@ -383,7 +383,8 @@ corrupt_graph_and_verify() {
cp $objdir/info/commit-graph commit-graph-backup &&
printf "$data" | dd of="$objdir/info/commit-graph" bs=1 seek="$pos" 
conv=notrunc &&
dd of="$objdir/info/commit-graph" bs=1 seek="$zero_pos" count=0 &&
-   dd if=/dev/zero of="$objdir/info/commit-graph" bs=1 seek="$zero_pos" 
count=$(($orig_size - $zero_pos)) &&
+   gen_zero_bytes $(($orig_size - $zero_pos)) >zeroes &&
+   dd if=zeroes of="$objdir/info/commit-graph" bs=1 seek="$zero_pos" &&
test_must_fail git commit-graph verify 2>test_err &&
grep -v "^+" test_err >err &&
test_i18ngrep "$grepstr" err
diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh
index 92cf8f812c..4afab14431 100644
--- a/t/test-lib-functions.sh
+++ b/t/test-lib-functions.sh
@@ -1302,3 +1302,8 @@ test_set_port () {
port=$(($port + ${GIT_TEST_STRESS_JOB_NR:-0}))
eval $var=$port
 }
+
+# Generate an output of $1 bytes of all zeroes (NULs, not ASCII zeroes).
+gen_zero_bytes () {
+   perl -e 'print "\0" x $ARGV[0]' "$@"
+}

For the others that need infinite zeroes, I think using "yes" makes more
sense, though we could also teach this function to accept an "infinity"
parameter.

-Peff


Re: [Breakage] Git v2.21.0-rc0 - t5318 (NonStop)

2019-02-08 Thread Jeff King
On Fri, Feb 08, 2019 at 05:53:53PM -0500, Randall S. Becker wrote:

> > diff --git a/t/test-lib-functions.sh b/t/test-lib-functions.sh index
> > 92cf8f812c..4afab14431 100644
> > --- a/t/test-lib-functions.sh
> > +++ b/t/test-lib-functions.sh
> > @@ -1302,3 +1302,8 @@ test_set_port () {
> > port=$(($port + ${GIT_TEST_STRESS_JOB_NR:-0}))
> > eval $var=$port
> >  }
> > +
> > +# Generate an output of $1 bytes of all zeroes (NULs, not ASCII zeroes).
> > +gen_zero_bytes () {
> > +   perl -e 'print "\0" x $ARGV[0]' "$@"
> > +}
> 
> This function does work on platform, so it's good.

Great. Since it sounds like you're preparing some patches to deal with
/dev/zero elsewhere, do you want to wrap it up in a patch as part of
that?

-Peff


Re: [Breakage] Git v2.21.0-rc0 - t5318 (NonStop)

2019-02-09 Thread Jeff King
On Sat, Feb 09, 2019 at 09:39:43AM +0100, Johannes Sixt wrote:

> > Great. Since it sounds like you're preparing some patches to deal with
> > /dev/zero elsewhere, do you want to wrap it up in a patch as part of
> > that?
> 
> Please do not use yes to generate an infinite amount of bytes. Our
> implementation of yes() in test-lib.sh generates only 99 lines.

Ah, thanks. That doesn't matter here, but it would for the other patches
under discussion.

> Perhaps do this.
> [...]
>   dd of="$objdir/info/commit-graph" bs=1 seek="$zero_pos" count=0 &&
> - dd if=/dev/zero of="$objdir/info/commit-graph" bs=1 seek="$zero_pos" 
> count=$(($orig_size - $zero_pos)) &&
> + printf "%0*d" $(($orig_size - $zero_pos)) 0 | tr 0 '\0' |
> + dd of="$objdir/info/commit-graph" bs=1 seek="$zero_pos" &&

Using stdin instead of the tmpfile is nice, and shouldn't have any
problems. I do think your printf suggestion looks nice, but I wondered
if it might run into portability issues (not because of anything in
particular, but I often find that the more clever a shell solution, the
more likely we run into obscure problems).

But if it works everywhere, that's fine by me.

-Peff


Re: [RFC for GIT] pull-request: add praise to people doing QA

2017-01-19 Thread Jeff King
On Thu, Jan 19, 2017 at 09:43:45PM +0100, Wolfram Sang wrote:

> > As to the implementation, I am wondering if we can make this somehow
> > work well with the "trailers" code we already have, instead of
> > inventing yet another parser of trailers.  
> > 
> > In its current shape, "interpret-trailers" focuses on "editing" an
> > existing commit log message to tweak the trailer lines.  That mode
> > of operation would help amending and rebasing, and to do that it
> > needs to parse the commit log message, identify trailer blocks,
> > parse out each trailer lines, etc.  
> > 
> > There is no fundamental reason why its output must be an edited
> > original commit log message---it should be usable as a filter that
> > picks trailer lines of the selected trailer type, like "Tested-By",
> > etc.
> 
> I didn't know about trailers before. As I undestand it, I could use
> "Tested-by" as the key, and the commit subject as the value. This list
> then could be parsed and brought into proper output shape. It would
> simplify the subject parsing, but most things my AWK script currently
> does would still need to stay or to be reimplemented (extracting names
> from tags, creating arrays of tags given by $name). Am I correct?
> 
> All under the assumption that trailers work on a range of commits. I
> have to admit that adding this to git is beyond my scope.

This sounds a lot like the shortlog-trailers work I did about a year
ago:

  http://public-inbox.org/git/20151229073832.gn8...@sigill.intra.peff.net/

  http://public-inbox.org/git/20151229075013.ga9...@sigill.intra.peff.net/

Nobody seemed to really find it useful, so I didn't pursue it.

Some of the preparatory patches in that series bit-rotted in the
meantime, but you can play with a version based on v2.7.0 by fetching
the "shortlog-trailers-historical" branch from
https://github.com/peff/git.git.

And then things like:

  git shortlog --ident=tested-by --format='...tested a patch by %an'

work (and you can put whatever commit items you want into the --format,
including just dumping the hash if you want to do more analysis).

-Peff


Re: [char-misc-next] mei: request async autosuspend at the end of enumeration

2016-11-24 Thread Jeff King
On Thu, Nov 24, 2016 at 04:10:49PM +, Winkler, Tomas wrote:

> > Cc:  # 4.4+
> 
> Looks like git send-email is not able to parse this address correctly
> though this is suggested format by Documentation/stable_kernel_rules.txt.
> Create wrong address If git parsers is used : 'sta...@vger.kernel.org#4.4+' 
> 
> Something like s/#.*$// is needed before parsing Cc:

This should be fixed by e3fdbcc8e (parse_mailboxes: accept extra text
after <...> address, 2016-10-13), which will be released next week as
part of v2.11. As a workaround, you can also install the Mail::Address
perl module.  See [1] for the gory details.

-Peff

[1] 
http://public-inbox.org/git/41164484-309b-bfff-ddbb-55153495d...@lwfinger.net/


Re: [char-misc-next] mei: request async autosuspend at the end of enumeration

2016-11-24 Thread Jeff King
On Thu, Nov 24, 2016 at 10:37:14PM +, Winkler, Tomas wrote:

> > > > Cc:  # 4.4+
> > >
> > > Looks like git send-email is not able to parse this address correctly
> > > though this is suggested format by Documentation/stable_kernel_rules.txt.
> > > Create wrong address If git parsers is used : 
> > > 'sta...@vger.kernel.org#4.4+'
> > >
> > > Something like s/#.*$// is needed before parsing Cc:
> > 
> > This should be fixed by e3fdbcc8e (parse_mailboxes: accept extra text after
> > <...> address, 2016-10-13), which will be released next week as part of 
> > v2.11. As
> > a workaround, you can also install the Mail::Address perl module.  See [1] 
> > for
> > the gory details.
> 
> Thanks for update, I failed to understand from the thread though what
> decision was actually applied, I will look at the patch itself.  I've
> tried to install Mail::Address and it fixes the actual address, but it
> appends  the # 4.4+ suffix into the name  as also mentioned in the
> thread, which doesn't fit the stable rules doc.

The patch just brings parity to the Mail::Address behavior and git's
fallback parser, so that you don't end up with the broken
sta...@vger.kernel.org#4.4+ address. Instead, that content goes into the
name part of the address.

It sounds like you want the "# 4.4+" to be dropped entirely in the
rfc822 header. It looks like send-email used to do that, but stopped in
b1c8a11c8 (send-email: allow multiple emails using --cc, --to and --bcc,
2015-06-30).

So perhaps there are further fixes required, but it's hard to know. The
input isn't a valid rfc822 header, so it's not entirely clear what the
output is supposed to be. I can buy either "drop it completely" or
"stick it in the name field of the cc header" as reasonable.

-Peff


Re: git email From: parsing (was Re: [GIT PULL] Staging/IIO driver patches for 4.11-rc1)

2017-02-22 Thread Jeff King
On Thu, Feb 23, 2017 at 07:04:44AM +0100, Greg KH wrote:

> > Poor Simon Sandström.
> > 
> > Funnily enough, this only exists for one commit. You've got several
> > other commits from Simon that get his name right.
> > 
> > What happened?
> 
> I don't know what happened, I used git for this, I don't use quilt for
> "normal" patches accepted into my trees anymore, only for stable kernel
> work.
> 
> So either the mail is malformed, or git couldn't figure it out, I've
> attached the original message below, and cc:ed the git mailing list.
> 
> Also, Simon emailed me after this was committed saying something went
> wrong, but I couldn't go back and rebase my tree.  Simon, did you ever
> figure out if something was odd on your end?
> 
> Git developers, any ideas?

The problem isn't on the applying end, but rather on the generating end.
The From header in the attached mbox is:

  From: =?us-ascii?B?PT9VVEYtOD9xP1NpbW9uPTIwU2FuZHN0cj1DMz1CNm0/PQ==?= 


If you de-base64 that, you get:

  =?UTF-8?q?Simon=20Sandstr=C3=B6m?=

So something double-encoded it before it got to your mbox.

-Peff


Re: git email From: parsing (was Re: [GIT PULL] Staging/IIO driver patches for 4.11-rc1)

2017-02-24 Thread Jeff King
On Fri, Feb 24, 2017 at 12:03:45PM +0100, Geert Uytterhoeven wrote:

> > The problem isn't on the applying end, but rather on the generating end.
> > The From header in the attached mbox is:
> >
> >   From: =?us-ascii?B?PT9VVEYtOD9xP1NpbW9uPTIwU2FuZHN0cj1DMz1CNm0/PQ==?= 
> > 
> 
> Slightly related, once in a while I get funny emails through
> git-commits-h...@vger.kernel.org, where the subject is completely screwed up:
> 
> Subject: \x64\x72\x6D\x2F\x74\x69\x6E\x79\x64\x72\x6D\x3A
> \x6D\x69\x70\x69\x2D\x64\x62\x69\x3A \x53\x69\x6C\x65\x6E\x63\x65\x3A
> ‘\x63\x6D\x64’ \x6D\x61\x79 \x62\x65

Sorry, I don't have a clue on that one.

If you have UTF-8 or other non-ASCII characters in your subject,
format-patch will correctly do the rfc2047 encoding (and it should
always use QP). And that would kick in here because of the UTF-8 quotes.

But that weird "\x" encoding is not in any mail standard I know of (and
certainly Git would never do it).

The odd thing is that the quotes themselves _aren't_ encoded. Just
everything else.

One other feature is that subject line is long enough (especially
QP-encoded) that it spans two lines:

  $ git format-patch --stdout -1 b401f34314d | grep -A1 ^Subject
  Subject: [PATCH] =?UTF-8?q?drm/tinydrm:=20mipi-dbi:=20Silence:=20=E2=80=98?=
   =?UTF-8?q?cmd=E2=80=99=20may=20be=20used=20uninitialized?=

It's possible that something along the way is mis-handling subjects with
line-continuation (though why it would escape those characters, I don't
know).

> and some of the mail headers end up in the body as well:
> 
> 
> =?UTF-8?Q?\x75\x73\x65\x64_\x75\x6E\x69\x6E\x69\x74\x69\x61\x6C\x69\x7A\x65\x64?=

That might be related to the whitespace continuation (the first line
after the break is the second line of the subject).

-Peff