Jim Meyering wrote:
> Bruno Haible wrote:
>> What should I write in the NEWS file, about recommendations for people who
>> have patches on top of gnulib?
>
> We also need a way to keep things in order going forward.
> I.e., a syntax-check style rule that enforces this style.
>
> To that end, please prepare a file like the one below,
> to be committed along with your other changes,
> or as part of a subsequent change that enforces policy.
> I started based on your earlier outline.
>
> These are extended regular expressions that match
> any file that must retain TAB-based indentation.
> For now, let's not worry about TABs elsewhere.
> --------------------------
> # These contain Makefile snippets.
> ^modules/
>
> # The regex module is the only major source code for which we still
> # have bidirectional propagation between gnulib and glibc.
> ^lib/regcomp\.c$
> ^lib/regex\.[ch]$
> ^lib/regex_internal\.[ch]$
> ^lib/regexec\.c$
>
> # This is special.
> ^lib/.*\.charset$
>
> # This is a binary file.
> ^lib/.*\.class$
> --------------------------
>
>> What are the tricks?
>
> I'll try to post details tomorrow.

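As a quick sanity check of such a list, here is a small shell sketch that
tests whether a file name matches any of the expressions.  The function
name is_exempt is made up for this example, and skipping comment and
blank lines is an assumption about how the list would be read, not
something the list format itself guarantees.

```sh
# Sketch only.  is_exempt is a hypothetical helper, not part of gnulib.
# Usage: is_exempt FILE_NAME REGEX_LIST
# Succeeds if FILE_NAME matches any extended regular expression in
# REGEX_LIST.  Comment lines and blank lines in the list are skipped
# (an assumption made for this example).
is_exempt ()
{
  patterns=$(grep -E -v '^[[:space:]]*(#|$)' "$2") || return 1
  # grep treats a newline-separated argument to -e as a list of patterns.
  printf '%s\n' "$1" | grep -E -q -e "$patterns"
}
```

With the list saved as, say, leading-blank.exempt, something like
"is_exempt lib/regex.c leading-blank.exempt && echo keep TABs" would
report the files that must be left alone.
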
The first part is the "patch-xform" script below.
I'll put it in gnulib's build-aux soon.

For example, I've just used it in coreutils-with-latest-gnulib
to confirm that it can transform the two gl/lib/*.diff files
that no longer apply:

    cd coreutils/gl/lib &&
    for i in c h; do
      f=tempname.$i.diff; patch-xform $f > k && mv k $f
    done

#!/usr/bin/perl
# Expand leading TABs in the context and modified lines of git unidiff patches.
# If --exclude=FILE is specified, do not modify the patches of any file whose
# name matches any of the perl regular expressions (one per line) in that file.
# The regular expressions are matched against each full, relative file name, as
# found in git unidiff headers, but without the typical "a/", "b/", etc. prefix.
# Here is a useful set of regular expressions:
#
# (?:^|\/)ChangeLog[^/]*$
# (?:^|\/)(?:GNU)?[Mm]akefile[^/]*$
# \.(?:am|mk)$
#
# Only lines to consider:
#
# /^[ +-]/          matched and context lines, when in a diff
#
# /^diff --git/     this is a git diff: ignore a/ and b/ file name prefix
# /^--- (.*)/       use the file name in $1
# /^\+\+\+ /        ignore
#
# Currently makes no attempt to detect the end of the final patch,
# so it may convert TABs to spaces on anything there that resembles
# a unidiff-context/modified line.

use strict;
use warnings;
use Text::Tabs;
use Getopt::Long;

(my $ME = $0) =~ s|.*/||;
my $VERSION = '0.1';

my $verbose;

sub usage ($)
{
  my ($exit_code) = @_;
  my $STREAM = ($exit_code == 0 ? *STDOUT : *STDERR);
  if ($exit_code != 0)
    {
      print $STREAM "Try `$ME --help' for more information.\n";
    }
  else
    {
      my $example_regexp = <<\EOF;
(?:^|\/)ChangeLog[^/]*$
(?:^|\/)(?:GNU)?[Mm]akefile[^/]*$
\.(?:am|mk)$
EOF
      print $STREAM <<EOF;
Usage: $ME [OPTIONS] [FILE]
Filter FILE (containing git unidiff output), expanding leading TABs
in the context and modified lines.

OPTIONS:

   --exclude=RE_FILE  if RE_FILE is specified, do not modify the patches of
                        any file whose name matches any of the perl regular
                        expressions (one per line) in that file.
   --help             display this help and exit
   --version          output version information and exit

With no FILE, or when FILE is -, read standard input.

Sample content for a RE_FILE:

$example_regexp
Be sure to exclude any binary files, e.g., .jpg, .pdf, etc. too.

EOF
    }
  exit $exit_code;
}

sub build_regexp ($)
{
  my ($file) = @_;

  # Read regexps from $file, one per line, then 'OR'ing them together
  # and wrap in (?:...) to form our result.
  open IN, '<', $file
    or die "$ME: $file: cannot open for reading: $!\n";
  my @lines = <IN>;
  close IN;
  chomp @lines;
  my $re = join '|', @lines;
  return "(?:$re)";
}

{
  my $exclude_regexp_file;
  GetOptions
    (
     'exclude=s' => \$exclude_regexp_file,
     help => sub { usage 0 },
     verbose => \$verbose,
     version => sub { print "$ME version $VERSION\n"; exit },
    ) or usage 1;

  my $exempt_file_re;
  defined $exclude_regexp_file
    and $exempt_file_re = build_regexp $exclude_regexp_file;

  my $xform_tabs;
  while (defined (my $line = <>))
    {
      my $xformed;
      if ($line =~ /^--- [a-z]\/(.*)/)  # use the file name in $1
        {
          my $file_name = $1;
          $xform_tabs = (defined $exempt_file_re
                         ? $file_name !~ /$exempt_file_re/o
                         : 1);
          $verbose
            and warn "info: $file_name: " . ($xform_tabs ? 1 : 0) . "\n";
        }
      elsif ($line =~ /^(?:\@\@[ ]
                         |(copy|rename)[ ]
                         |[ ]\d{6}$
                         |diff[ ]--git[ ]
                         |index[ ]
                        )
                      /x)
        {
          # ignore
        }
      elsif ($line =~ /^(?:$|[ +-])/)
        {
          $verbose and warn "info: $.\n";
          # Process or not, depending on name.
          if ($xform_tabs)
            {
              $verbose and warn "info: $line\n";
              my $match = $line =~ /^([ +-])( *\t[ \t]*)(.*)/;
              print $match ? $1 . expand($2) . $3 . "\n" : $line;
              $xformed = 1;
              $verbose && $match and warn "info: MATCHED!\n";
            }
        }
      else
        {
          # warn "$ME: unrecognized line: $line\n";
          $xform_tabs = 0;
        }
      ! $xformed
        and print $line;
    }
}

END
{
  # use File::Coda; # http://meyering.net/code/Coda/
  defined fileno STDOUT
    or return;
  close STDOUT
    and return;
  warn "$ME: failed to close standard output: $!\n";
  $? ||= 1;
}

# Local variables:
#  indent-tabs-mode: nil
# End:

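For anyone who only wants the gist of the per-line step, here is a rough
shell sketch of it.  It is not a replacement for the script: it does not
track file names, so it has no equivalent of --exclude, and it runs one
process per line, so it is slow.  It assumes an expand that accepts -i
(--initial in GNU coreutils), and the function names xform_line and
xform_stream are made up for this example.

```sh
# Sketch only; assumes `expand -i` (convert only leading TABs) is available.

# Print one diff line with the TABs in its indentation expanded.
# The first character (' ', '+' or '-') is the diff marker, so tab stops
# are computed on the text that follows it, as in the Perl script.
xform_line ()
{
  rest=${1#?}             # the line without its first character
  marker=${1%"$rest"}     # that first character (empty for an empty line)
  printf '%s%s\n' "$marker" "$(printf '%s\n' "$rest" | expand -i)"
}

# Filter a patch on standard input.  Header lines such as "--- a/file"
# also start with '-' or '+', but they have no leading blanks after the
# first character, so xform_line leaves them unchanged.
xform_stream ()
{
  while IFS= read -r line || [ -n "$line" ]; do
    case $line in
      ' '*|'+'*|'-'*) xform_line "$line" ;;     # context and modified lines
      *)              printf '%s\n' "$line" ;;  # hunk headers, metadata
    esac
  done
}
```

Something like "xform_stream < some.diff > some.diff.new" would then
filter a whole patch.
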
You can do the same thing to a topic branch in git.
Here is pseudo-texinfo:

Let's assume that just after transforming @samp{master}, you tagged the
result with @samp{tab} and the changes you want to rebase are on the
@samp{topic} branch.  With that, you would run these commands to rebase
that branch:

@example
git checkout topic                                    [1]
git rebase tab^                                       [2]
git format-patch --stdout master \
  | patch-xform --exclude=leading-blank.exempt \
  > topic.xformed                                     [3]
git checkout -b topic2 tab                            [4]
git am topic.xformed                                  [5]
git diff --ignore-space-change topic topic2           [6]
git branch -D topic                                   [7]
git branch -m topic2 topic                            [8]
git rebase master                                     [9]
@end example

Step 1 ensures that @samp{topic} is the current branch, which [2]
rebases to @samp{tab^}, the change-set just before the problematic one.
The third step prints the patch series on @samp{topic}, filters it
through our patch-transforming script and saves the result in a
temporary file.  Step 4 creates and makes current our temporary branch,
@samp{topic2}, with its base at @samp{tab}, and [5] then applies the
transformed patch set to that new branch.  [6] is an optional
cross-check to ensure that the only differences between the two
branches are safely ignorable.  Steps 7 and 8 clean up by removing
the original @samp{topic} branch and replacing it with the temporary
one.  Finally, step 9 rebases our new branch to @samp{master}.

We can perform the same task more efficiently and concisely, with the
advantage of no temporary file, but perhaps at the expense of
readability, depending on your familiarity with these @command{git}
commands.  You be the judge:

@example
git rebase tab^ topic                                 [a]
git checkout -b topic2 tab                            [b]
git format-patch --stdout master..topic \
  | patch-xform --exclude=leading-blank.exempt \
  | git am                                            [c]
git diff --ignore-space-change topic topic2           [d]
git branch -D topic                                   [e]
git branch -m topic2 topic                            [f]
git rebase master                                     [g]
@end example

Step [a] combines [1] and [2], since there is no need to change the
current branch.
Since [c]'s use of @samp{git am} will modify the current branch
(contrast with [3], which just writes a temporary file), step [b]
must first create and switch to the destination branch, @samp{topic2}.
Step [c] forms the patch series for everything on the @samp{topic}
branch, filters it through our @command{patch-xform} script, and
applies the result to the current branch via @command{git am}.
The remaining steps are identical to 6@dots{}9.

However, all of the above doesn't qualify as ``easy enough'' for
most people.  There are too many variables and interdependencies.
Note that [a] and [g] may evoke merge conflicts, so they delineate
the non-interactive core: [b]@dots{}[f].

Even for so few steps, there are four inputs:

@itemize
@item @var{P} parent branch name [master]
@item @var{T} tag marking the transition point on @var{P} [tab]
@item @var{B} name of branch to move [topic]
(forked off of @var{P} prior to @var{T})
@item file name blacklist: [leading-blank.exempt]
@end itemize

@c note that the list of branch names from "git branch --contains @var{T}"
@c must include @var{P}

You can also think of the type of transformation as an input:
trailing-blank-removal or leading-TAB-to-space, or even both.
If you make that the fifth input, verify that @var{T} contains
only changes implied by this type.

Actually, there's an even better way: automatically derive the
type from @var{T}'s change set.  If this command prints no changes,
then @var{T} is a trailing-blank-removal delta:

@example
git diff --ignore-space-at-eol T^..T
@end example

Otherwise, if @var{T}'s delta transforms @kbd{TAB}s to spaces in
indentation, this command will print no diffs:

@example
git diff --ignore-space-change T^..T
@end example
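
To make that derivation concrete, here is a minimal shell sketch that
runs the two checks in order.  The function name delta_type and the
labels it prints are made up for this example.  Note that an empty diff
only shows that the delta is consistent with a given type: an empty
commit would pass the first check too, and the second check accepts any
change in the amount of whitespace, not just TAB-to-space conversion.

```sh
# Sketch only.  delta_type is a hypothetical helper; the labels it
# prints are illustrative names, not anything git itself reports.
# Usage: delta_type T
delta_type ()
{
  if [ -z "$(git diff --ignore-space-at-eol "$1^..$1")" ]; then
    # Nothing left once end-of-line whitespace is ignored.
    echo trailing-blank-removal
  elif [ -z "$(git diff --ignore-space-change "$1^..$1")" ]; then
    # Nothing left once changes in the amount of whitespace are ignored.
    echo leading-TAB-to-space
  else
    echo other
  fi
}
```

For the example above, running "delta_type tab" would be expected to
print leading-TAB-to-space if the @samp{tab} change set did nothing but
re-indent.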