Thanks a lot for the code review. Any comments on the big picture or design?
Frank Lichtenheld wrote: > On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote: > > I have a sourcev3 branch with my changes at <git://kitenet.net/dpkg>, > > and have also attached a diff to this mail. I feel that this is ready > > for review and hopefully merging into dpkg now. Looking forward to your > > comments. > > A little code review follows. > > > +# You should have received a copy of the GNU General Public License > > +# along with this program; if not, write to the Free Software > > +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA > > old FSF address (not really important, but while we're at it ;) Copied from elsewhere in dpkg source. :-) > > +sub sanity_check { > > + my $srcdir=shift; > > + > > + if (! -s "$srcdir/.git") { > > + main::error(sprintf(_g("%s is not the top directory of a git > > repository (%s/.git not present), but Format git was specified"), $srcdir, > > $srcdir)); > > you probably mean -e or -d here? -s on a directory is kinda strange. > printing $srcdir twice might bloat the error message. Yes, I meant -d, the -s snuck in from the other test. ACK on the duplication. > > + } > > + if (-s "$srcdir/.gitmodules") { > > + main::error(sprintf(_g("git repository %s uses submodules. This > > is not yet supported."), $srcdir)); > > + } > > + > > + # Symlinks from .git to outside could cause unpack failures, or > > + # point to files they shouldn't, so check for and don't allow. > > + if (-l "$srcdir/.git") { > > + main::error(sprintf(_g("%s is a symlink"), "$srcdir/.git")); > > + } > > + my $abs_srcdir=Cwd::abs_path($srcdir); > > + find(sub { > > + if (-l $_) { > > + if (Cwd::abs_path(readlink($_)) !~ > > /^\Q$abs_srcdir\E(\/|$)/) { > > + main::error(sprintf(_g("%s is a symlink to > > outside %s"), $File::Find::name, $srcdir)); > > + } > > + } > > + }, "$srcdir/.git"); > > Maybe it would be easier to just disallow symlinks completly? Or are > there important use cases for that? I've tried to not make dpkg have to know too much about git internals. (As you can see I've not been 100% successful, but have kept it to about the level someone with a week's knowledge of git would be comfortable with.) So while I don't see any symlinks in my git repos, if git decides to use symlinks, I don't want dpkg to have to be updated. (I think git did historically use symlinks in the repo). There are probably semi-valid reasons to manually add symlinks inside a .git directory today, too. > > +} > > + > > +# Called before a tarball is created, to prepare the tar directory. > > +sub prep_tar { > > + my $srcdir=shift; > > + my $tardir=shift; > > + > > + sanity_check($srcdir); > > + > > + if (! -e "$srcdir/.git") { > > + main::error(sprintf(_g("%s is not a git repository, but Format > > git was specified"), $srcdir)); > > + } > > + if (-e "$srcdir/.gitmodules") { > > + main::error(sprintf(_g("git repository %s uses submodules. This > > is not yet supported."), $srcdir)); > > + } > > Duplicated code from sanity_check Doh! > > + > > + # Check for uncommitted files. > > + open(GIT_STATUS, "LANG=C cd $srcdir && git-status |") || > > + main::subprocerr("cd $srcdir && git-status"); > > you make a lot "cd $srcdir". Maybe you should just chdir() in the parent > process? I could make it do that, I suppose it would be safe as long as I cd back (dpkg-source in general assumes it's in the parent dir of the source tree). > This would also take care of funny things in $srcdir like > whitespaces... > If you get rid of the cd you could use the '-|', @array form of open > here which would be preferable imho. Wow, you've taught me something new, I only knew about the much more clumsy manual fork and open("-|") approach. I'll do this, but it will take a little while. > > + my $clean=0; > > + my $status=""; > > + while (<GIT_STATUS>) { > > + if (/^\Qnothing to commit (working directory clean)\E$/) { > > + $clean=1; > > + } > > + else { > > + $status.="git-status: $_"; > > + } > > + } > > + close GIT_STATUS; > > + # git-status exits 1 if there are uncommitted changes or if > > + # the repo is clean, and 0 if there are uncommitted changes > > + # listed in the index. > > + if ($? && $? >> 8 != 1) { > > + main::subprocerr("cd $srcdir && git status"); > > + } > > + if (! $clean) { > > + # To support dpkg-buildpackage -i, get a list of files > > dpkg-source -i would be the proper attribution here. dpkg-buildpackage > implements -i only as a pass-through option. True. > > + # eqivilant to the ones git-status finds, and remove any > > is that an English word? Even better, a common typo of one. :-) > > + # ignored files from it. > > + my @ignores="--exclude-per-directory=.gitignore"; > > + my $core_excludesfile=`cd $srcdir && git-config --get > > core.excludesfile`; > > + chomp $core_excludesfile; > > + if (length $core_excludesfile && -e > > "$srcdir/$core_excludesfile") { > > + push @ignores, "--exclude-from='$core_excludesfile'"; > > + } > > + if (-e "$srcdir/.git/info/exclude") { > > + push @ignores, "--exclude-from=.git/info/exclude"; > > + } > > + open(GIT_LS_FILES, "cd $srcdir && git-ls-files -m -d -o > > @ignores |") || > > + main::subprocerr("cd $srcdir && git-ls-files"); > > This is essentially running git-status again without the output > beautification... Can't we avoid doing the work twice? The git-status output is not designed to be machine parseable. Conversely, the git-ls-files output is not designed to be human parseable (the command is not part of the "porcelin"). dpkg-source could just list the non-ignored modified files and not output the git-status and leave the user to do that by hand. > Also I would prefer using long options where available. It's not like > anyone has to type them more than once ;) Done. > > + my @files; > > + while (<GIT_LS_FILES>) { > > + chomp; > > + if (! length $main::diff_ignore_regexp || > > + ! m/$main::diff_ignore_regexp/o) { > > + push @files, $_; > > + } > > + } > > + close(GIT_LS_FILES) || main::syserr("git-ls-files exited > > nonzero"); > > + > > + if (@files) { > > + print $status; > > + main::error(sprintf(_g("uncommitted, not-ignored > > changes in working directory: %s"), > > + join(" ", @files))); > > That intendation looks wrong. I assume you mean of the join part? I typically write this one of two ways, I've switched it to the other way. > > + } > > + } > > + > > + # garbage collect the repo > > + system("cd $srcdir && git-gc --prune"); > > + $? && main::subprocerr("cd $srcdir && git-gc --prune"); > > Again, dropping the cd would make it possible to use the @array form. > git gc --prune is also a very rude thing to do implicitly. Maybe this > should better be done in the copy? Agreed, done. > We could be even more rude then and > also delete the reflog which would make git gc way more efficient in > some cases. I had been meaning to look at that but forgot. You think I should run git-reflog expire --expire=<NOW> before git-gc? git-reflog expire --all does not in fact seem to delete all reflog entries. I haven't figured out what to use for <NOW> to make it expire everything. > > + # TODO support for creating a shallow clone for those cases where > > + # uploading the whole repo history is not desired > > + > > + mkdir($tardir,0755) || > > + &syserr(sprintf(_g("unable to create `%s'"), $tardir)); > > + system("cp -a $srcdir/.git $tardir"); > > + $? && main::subprocerr("cp -a $srcdir/.git $tardir"); > > you actually can use the @array form here. Rodger. > > + main::warning(sprintf(_g("executable bit set on %s; > > clearing"), $hook)); > > + chmod(0644 &~ umask(), $hook) || > > + main::syserr(sprintf(_g("unable to change > > permission of `%s'"), $hook)); > > why these very cautios permissions here (i.e. why not 0666)? Your're right, that should be 666 &~umask. > > + # This needs to be an absolute path, for some reason. > > + my $configfile=Cwd::abs_path("$srcdir/.git/config"); > > + open(GIT_CONFIG, "git-config --file $configfile -l |") || > > + main::subprocerr("git-config"); > > you can use the @array form here. And using git-config --null > would preferable here. This was introduced only recently, but since > git-config isn't available in etch's version at all... Ok, I'll write that wnen I have more energy. > > + my @config=<GIT_CONFIG>; > > + close(GIT_CONFIG) || main::syserr("git-config exited nonzero"); > > + my %removed_fields; > > + foreach (@config) { > > + chomp; > > + my ($field, $value)=split(/=/, $_, 2); > > + if ($field !~ /$safe_fields/) { > > + if (! exists $removed_fields{$field}) { > > + system("git-config", "--file", $configfile, > > + "--unset-all", $field); > > + $? && main::subprocerr("git-config --file > > $configfile --unset-all $field"); > > + } > > + push @{$removed_fields{$field}}, $value; > > Have you tested what happens if you try to unset something that > git-config -l listed but that is really from the global or the system > config file? That's why I use --file $configfile, it really doesn't list things from anywhere except the specified file. > > + } > > + } > > + if (%removed_fields) { > > + open(GIT_CONFIG, ">>", $configfile) || > > + main::syserr(sprintf(_g("unstable to append to %s", > > $configfile))); > > + print GIT_CONFIG "\n# "._g("The following setting(s) were > > disabled by dpkg-source").":\n"; > > + foreach my $field (sort keys %removed_fields) { > > + foreach my $value (@{$removed_fields{$field}}) { > > + print GIT_CONFIG "# $field=$value\n"; > > + } > > + } > > + close GIT_CONFIG; > > + main::warning(_g(_g("modifying .git/config to comment out some > > settings"))); > > + } > > nested _g D'oh. > > + > > + # Note that git-reset is used to repopulate the WC with files. > > + # git-clone isn't used because the repo might be an unclonable > > + # shallow copy. git-reset also recreates the index. > > + # XXX git-reset should be made to run in quiet mode here, but > > + # lacks a good way to do it. Bug filed. > > + system("cd $srcdir && git-reset --hard"); > > again a chdir + system(@array) would be better imho. > Why not git checkout -f ? Because I didn't know about it (< 1 week ;-). It does seem to work equivilantly and avoid ugly messages, so I've made the change. > > +sub loadvcs { > > + my $vcs = shift; > > + my $mod = "Dpkg::Source::VCS::$vcs"; > > + eval qq{use $mod}; > > + return ! $@; > > +} > > Since we never ever want to import something from $mod, maybe we should > require it? Well, I can't see any reason to import it right now, so I don't mind making that change. > > - > > + > > Whitespace changes? Fixed. Patch attached with changes from this review (or see my repo). I'll have a second patch later to remove shell exposures, and use git-config --null. -- see shy jo
diff --git a/scripts/Dpkg/Source/VCS/git.pm b/scripts/Dpkg/Source/VCS/git.pm index cac7d05..6eb4894 100644 --- a/scripts/Dpkg/Source/VCS/git.pm +++ b/scripts/Dpkg/Source/VCS/git.pm @@ -40,8 +40,8 @@ delete $ENV{GIT_WORK_TREE}; sub sanity_check { my $srcdir=shift; - if (! -s "$srcdir/.git") { - main::error(sprintf(_g("%s is not the top directory of a git repository (%s/.git not present), but Format git was specified"), $srcdir, $srcdir)); + if (! -d "$srcdir/.git") { + main::error(sprintf(_g("source directory is not the top directory of a git repository (%s/.git not present), but Format git was specified"), $srcdir)); } if (-s "$srcdir/.gitmodules") { main::error(sprintf(_g("git repository %s uses submodules. This is not yet supported."), $srcdir)); @@ -98,8 +98,8 @@ sub prep_tar { main::subprocerr("cd $srcdir && git status"); } if (! $clean) { - # To support dpkg-buildpackage -i, get a list of files - # eqivilant to the ones git-status finds, and remove any + # To support dpkg-source -i, get a list of files + # equivalent to the ones git-status finds, and remove any # ignored files from it. my @ignores="--exclude-per-directory=.gitignore"; my $core_excludesfile=`cd $srcdir && git-config --get core.excludesfile`; @@ -110,7 +110,7 @@ sub prep_tar { if (-e "$srcdir/.git/info/exclude") { push @ignores, "--exclude-from=.git/info/exclude"; } - open(GIT_LS_FILES, "cd $srcdir && git-ls-files -m -d -o @ignores |") || + open(GIT_LS_FILES, "cd $srcdir && git-ls-files --modified --deleted --others @ignores |") || main::subprocerr("cd $srcdir && git-ls-files"); my @files; while (<GIT_LS_FILES>) { @@ -125,22 +125,22 @@ sub prep_tar { if (@files) { print $status; main::error(sprintf(_g("uncommitted, not-ignored changes in working directory: %s"), - join(" ", @files))); + join(" ", @files))); } } - # garbage collect the repo - system("cd $srcdir && git-gc --prune"); - $? && main::subprocerr("cd $srcdir && git-gc --prune"); - # TODO support for creating a shallow clone for those cases where # uploading the whole repo history is not desired mkdir($tardir,0755) || &syserr(sprintf(_g("unable to create `%s'"), $tardir)); - system("cp -a $srcdir/.git $tardir"); + system("cp", "-a", "$srcdir/.git", $tardir); $? && main::subprocerr("cp -a $srcdir/.git $tardir"); + # garbage collect the new repo + system("cd $tardir && git-gc --prune"); + $? && main::subprocerr("cd $tardir && git-gc --prune"); + # As an optimisation, remove the index. It will be recreated by git # reset during unpack. It's probably small, but you never know, this # might save a lot of space. @@ -158,7 +158,7 @@ sub post_unpack_tar { foreach my $hook (glob("$srcdir/.git/hooks/*")) { if (-x $hook) { main::warning(sprintf(_g("executable bit set on %s; clearing"), $hook)); - chmod(0644 &~ umask(), $hook) || + chmod(0666 &~ umask(), $hook) || main::syserr(sprintf(_g("unable to change permission of `%s'"), $hook)); } } @@ -211,16 +211,15 @@ sub post_unpack_tar { } } close GIT_CONFIG; - main::warning(_g(_g("modifying .git/config to comment out some settings"))); + main::warning(_g("modifying .git/config to comment out some settings")); } - # Note that git-reset is used to repopulate the WC with files. - # git-clone isn't used because the repo might be an unclonable - # shallow copy. git-reset also recreates the index. - # XXX git-reset should be made to run in quiet mode here, but - # lacks a good way to do it. Bug filed. - system("cd $srcdir && git-reset --hard"); - $? && main::subprocerr("cd $srcdir && git-reset --hard"); + # git-checkout is used to repopulate the WC with files + # and recreate the index. + # (git-clone isn't used because the repo might be an unclonable + # shallow copy.) + system("cd $srcdir && git-clone -f"); + $? && main::subprocerr("cd $srcdir && git-clone -f"); } 1 diff --git a/scripts/dpkg-source.pl b/scripts/dpkg-source.pl index 6c823c8..6774814 100755 --- a/scripts/dpkg-source.pl +++ b/scripts/dpkg-source.pl @@ -193,7 +193,7 @@ sub handleformat { sub loadvcs { my $vcs = shift; my $mod = "Dpkg::Source::VCS::$vcs"; - eval qq{use $mod}; + eval qq{require $mod}; return ! $@; } @@ -535,7 +535,7 @@ if ($opmode eq 'build') { &syserr(sprintf(_g("unable to rename `%s' (newly created) to `%s'"), $newtar, $tarname)); chmod(0666 &~ umask(), $tarname) || &syserr(sprintf(_g("unable to change permission of `%s'"), $tarname)); - + } else { printf(_g("%s: building %s using existing %s")."\n",
signature.asc
Description: Digital signature