Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Anthony Towns
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
> I've been working on making dpkg-source support a new source package format
> based upon git. The idea is that a source package has only a .dsc and a
> .git.tar.gz, which is just a git repo.

Is a .gitdiff.tar.gz possible, so the archive doesn't need to have the
full git repo replaced by each upload? ie, something like

Files:
  foo_1.0-1.git.tar.gz
  foo_1.0-2.gitdiff.tar.gz

so that a small patch only adds a small file to the archive rather than
replacing a large one?

This means you can't build the package by hand with standard unix tools
-- at the very least you need git installed, and if other VC systems
are to be supported, you need them too. Changes in repository formats
will presumably result in versioned dependencies too.

This is slightly worse than the case for existing patch management tools
in that most of those can be dealt with by hand; though cdbs and to a
lesser extent debhelper can't be quite as easily replicated I guess.

Once the unpack is done, I don't see any reason why you can't do an NMU
in the traditional way, so presuming "dpkg-source -x" or "apt-get source"
handles the unpack automatically, I don't think it necessarily imposes
any new requirements on NMUers.

Maybe providing a feature on packages.debian.org (or similar) to download
sources in simple, non-VC, tarball format would make this a complete
non-issue though?

Would it make sense to have the source format look more like:

Format: 3.0
Source: dpkg
...
Source-Depends: git-dpkg (>= 3.14159)
Source-Hooks: /usr/bin/git-dpkg
...
Files:
 ... foo_1.2.git.tar.gz

and have the git specific functionality be provided by a /usr/bin/git-dpkg
binary (with standardised arguments) from the git-dpkg package? That
would let you smoothly deal with repository changes and implementing new
interfaces, and also let us limit the allowable formats for the archive
reasonably simply.

You could drop the Source-Hooks: line, and just have dpkg-source know
to associate *.git.tar.gz with /usr/lib/dpkg/source/git, and trust the
package will provide it.
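
The suffix-to-helper association suggested above could look roughly like the following. This is only a sketch: the helper directory is the one named in the message, while the function name and the set of VCS names are hypothetical.

```python
# Hypothetical sketch: the helper directory comes from the message above,
# but the set of VCS names and the lookup function are illustrative only.
HELPER_DIR = "/usr/lib/dpkg/source"
KNOWN_VCS = {"git", "bzr"}  # assumption: one helper per supported VCS


def helper_for(filename):
    """Return the helper path for a *.<vcs>.tar.gz file, or None."""
    if not filename.endswith(".tar.gz"):
        return None
    stem = filename[: -len(".tar.gz")]
    # The VCS name is whatever follows the last dot of the remaining stem.
    _, dot, vcs = stem.rpartition(".")
    if not dot or vcs not in KNOWN_VCS:
        return None
    return f"{HELPER_DIR}/{vcs}"
```

Restricting the lookup to a fixed set of names is one way to "limit the allowable formats for the archive": a file such as foo_1.2.orig.tar.gz simply finds no helper.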

Bonus points: rather than "debian/rules clean, create a diff, build",
have dpkg do "debian/rules clean, commit any uncommitted changes with the
commit message being the changes from the changelog, create a .git.tgz,
build" for git-source-format packages.
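
A rough sketch of the "commit message being the changes from the changelog" part, assuming the usual debian/changelog layout (header line, change lines, then a trailer line starting with " -- "). The function name is hypothetical and nothing here actually runs git.

```python
def latest_changes(changelog_text):
    """Return the change lines of the topmost changelog entry.

    Assumes the first non-blank line is the entry header and that the
    entry ends at the first trailer line starting with " -- ".
    """
    body = []
    seen_header = False
    for line in changelog_text.splitlines():
        if not seen_header:
            # Skip leading blank lines and then the header line itself.
            seen_header = bool(line.strip())
            continue
        if line.startswith(" -- "):
            break  # trailer line: end of the topmost entry
        body.append(line.rstrip())
    # Drop the blank lines that surround the change list.
    return "\n".join(body).strip("\n")
```

The returned text could then be handed to the VCS as the commit message for any uncommitted changes.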

Cheers,
aj



signature.asc
Description: Digital signature


Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Anthony Towns wrote:
> Is a .gitdiff.tar.gz possible, so the archive doesn't need to have the
> full git repo replaced by each upload? ie, something like
> 
> Files:
>   foo_1.0-1.git.tar.gz
>   foo_1.0-2.gitdiff.tar.gz
> 
> so that a small patch only adds a small file to the archive rather than
> replacing a large one?

I think it's possible, the gitdiff might use git packs against a prior
repo. That would be a nice enhancement to what I have done.

> This means you can't build the package by hand with standard unix tools
> -- at the very least you need git installed, and if other VC systems
> are to be supported, you need them too.

Yes, as I mention in the faq I think this is an acceptable tradeoff to
get away from having to use diff.

> Changes in repository formats will presumably result in versioned
> dependencies too.

I don't think that dpkg should add vcs formats that we don't have a good
expectation of remaining supported by newer versions of the tools going
forward (so svn repos are out). There's a bit of discussion of this in
the faq. I think that git has a pretty good track record and has
incentive to keep compatibility support since this format is
used over the wire by git (eg, with http urls).

If the format changes in a non-backwards compatible way, we could have
source packages built on unstable that cannot be extracted on stable,
which I also think is suboptimal, but hard to completely avoid.

> This is slightly worse than the case for existing patch management tools
> in that most of those can be dealt with by hand; though cdbs and to a
> lesser extent debhelper can't be quite as easily replicated I guess.

Neither could packages using quilt before it was available in
stable or dbs before it was.

> Once the unpack is done, I don't see any reason why you can't do an NMU
> in the traditional way, so presuming "dpkg-source -x" or "apt-get source"
> handles the unpack automatically, I don't think it necessarily imposes
> any new requirements on NMUers.

Basically, you have to know how to git commit your changes before building
the NMU, and that's all. As a bonus, it's rather easier to generate NMU
patchsets. :-)

> Maybe providing a feature on packages.debian.org (or similar) to download
> sources in simple, non-VC, tarball format would make this a complete
> non-issue though?

pristine-tar could be used for this, it would just need source packages
to put the delta somewhere standardised (under debian/), and would need
some standardised way to get to the upstream source branch in git.

> Would it make sense to have the source format look more like:
> 
>   Format: 3.0
>   Source: dpkg
>   ...
>   Source-Depends: git-dpkg (>= 3.14159)
>   Source-Hooks: /usr/bin/git-dpkg
>   ...
>   Files:
>    ... foo_1.2.git.tar.gz
> 
> and have the git specific functionality be provided by a /usr/bin/git-dpkg
> binary (with standardised arguments) from the git-dpkg package? That
> would let you smoothly deal with repository changes and implementing new
> interfaces, and also let us limit the allowable formats for the archive
> reasonably simply.
> 
> You could drop the Source-Hooks: line, and just have dpkg-source know
> to associate *.git.tar.gz with /usr/lib/dpkg/source/git, and trust the
> package will provide it.
 
Not sure if this buys anything that using perl modules for the vcses
can't do, really. How do you envision this helping deal with repository
format changes?

> Bonus points: rather than "debian/rules clean, create a diff, build",
> have dpkg do "debian/rules clean, commit any uncommitted changes with the
> commit message being the changes from the changelog, create a .git.tgz,
> build" for git-source-format packages.

I have a feeling that any auto-commit stuff should be controlled by an
option. I'm *sure* that it would annoy some developers. No strong
feelings about whether it should default on or off, though least surprise
suggests *off*.

-- 
see shy jo




[PATCH/RFC] deb-version.5: Add an own manpage for Dpkg's version format

2007-10-06 Thread Frank Lichtenheld
I was looking for a way to really close #373003 (dpkg-dev: deb-control.5
old rule for Version hyphenation). In the end it led me to the
conclusion that Dpkg should contain a full description of its Version
format, which in turn led to completely copying the section from Policy
since this is probably the best description available. This of course
again leads to a few follow-up questions:

1) If I copy this text, whom should I credit for it? For now I just
copied the copyright notice from Policy but I suspect that might not be
the whole truth given how old it is.
2) Should we really try to include more documentation of dpkg's
behaviour in dpkg itself? (My answer is a clear "yes" to that.)
If yes, how do we avoid duplication with Policy? After all we probably
can't just delete such stuff from Policy since there might be
differences between what dpkg supports and what Policy allows. But not
documenting dpkg features until they are allowed by Policy is not
a good way either.
3) What do people think of this specific case of copying?
Worth it? Should I try to condense the information more?
What should we do with the text in Policy?

Comments welcome.
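
For readers who have not seen the format the proposed manpage documents, [epoch:]upstream_version[-debian_revision], here is a small sketch of how such a string splits into its three parts. The function name is made up for this illustration and this is not dpkg's own parser: the epoch is taken before the first colon (zero if absent) and the revision after the last hyphen.

```python
def split_version(version):
    """Split a Debian version string into (epoch, upstream, revision).

    Illustrative only: does no validation of the allowed characters.
    """
    # The epoch is everything before the first colon; zero if omitted.
    epoch, colon, rest = version.partition(":")
    if not colon:
        epoch, rest = "0", version
    # The revision is everything after the last hyphen, if there is one.
    upstream, hyphen, revision = rest.rpartition("-")
    if not hyphen:
        upstream, revision = rest, ""
    return int(epoch), upstream, revision
```

Splitting at the last hyphen is what lets an upstream version contain hyphens of its own whenever a Debian revision is present.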

Gruesse,
Frank

---
 debian/changelog  |2 +
 man/ChangeLog |9 
 man/deb-control.5 |7 ++-
 man/deb-version.5 |  124 +
 4 files changed, 139 insertions(+), 3 deletions(-)
 create mode 100644 man/deb-version.5

diff --git a/debian/changelog b/debian/changelog
index 6c33f1c..9facda3 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -35,6 +35,8 @@ dpkg (1.14.7) UNRELEASED; urgency=low
 Closes: #379418
   * Let dpkg-buildpackage error out early if the version number from
 the changelog is not a valid Debian version. Closes: #216075
+  * Add an own manpage for Dpkg's version format. Mostly stolen
+from policy. Closes: #373003
 
   [ Updated dpkg translations ]
   * Basque (Piarres Beobide). Closes: #440859
diff --git a/man/ChangeLog b/man/ChangeLog
index fcc2e1a..42bdc37 100644
--- a/man/ChangeLog
+++ b/man/ChangeLog
@@ -1,3 +1,12 @@
+2007-10-06  Frank Lichtenheld  <[EMAIL PROTECTED]>
+
+   * deb-control.5: Move description of
+   version format to...
+   * deb-version.5: Take the section from
+   policy describing version format and
+   sorting since this is probably as good
+   as it gets for describing these.
+
 2007-09-30  Frank Lichtenheld  <[EMAIL PROTECTED]>
 
* deb-control.5: Remove obsolete sentence regarding
diff --git a/man/deb-control.5 b/man/deb-control.5
index efc40c7..7043ef6 100644
--- a/man/deb-control.5
+++ b/man/deb-control.5
@@ -31,9 +31,9 @@ generate file names by most installation tools.
 .BR Version: " "
 Typically, this is the original package's version number in whatever form
 the program's author uses. It may also include a Debian revision number
-(for non-native packages). If both version and revision are supplied,
-they are separated by a hyphen, `-'. For this reason, the original version
-may not have a hyphen in its version number.
+(for non-native packages). The exact format and sorting algorithm
+are described in
+.BR deb-version (5).
 .TP
 .BR Maintainer: " "
 Should be in the format `Joe Bloggs <[EMAIL PROTECTED]>', and is typically
@@ -219,6 +219,7 @@ Description: GNU grep, egrep and fgrep.
 .
 .SH SEE ALSO
 .BR deb (5),
+.BR deb-version (5),
 .BR debtags (1),
 .BR dpkg (1),
 .BR dpkg-deb (1).
diff --git a/man/deb-version.5 b/man/deb-version.5
new file mode 100644
index 0000000..ea273ec
--- /dev/null
+++ b/man/deb-version.5
@@ -0,0 +1,124 @@
+.\" Author: ??
+.\" Includes text from the Debian Policy by ??
+.TH deb\-version 5 "2007-10-06" "Debian Project" "Debian"
+.SH NAME
+deb\-version \- Debian package version number format
+.
+.SH SYNOPSIS
+.RI "[ " epoch ":] " upstream_version " [\-" debian_revision " ]"
+.SH DESCRIPTION
+Version numbers as used for Debian binary and source packages
+consist of three components. These are:
+.TP
+.I epoch
+This is a single (generally small) unsigned integer.  It
+may be omitted, in which case zero is assumed.  If it is
+omitted then the \fIupstream_version\fP may not
+contain any colons.
+.IP
+It is provided to allow mistakes in the version numbers
+of older versions of a package, and also a package's
+previous version numbering schemes, to be left behind.
+.TP
+.I upstream_version
+This is the main part of the version number.  It is
+usually the version number of the original ("upstream")
+package from which the \fI.deb\fP file has been made,
+if this is applicable.  Usually this will be in the same
+format as that specified by the upstream author(s);
+however, it may need to be reformatted to fit into the
+package management system's format and comparison
+scheme.
+.IP
+The comparison behavior of the package management system
+with respect to the \fIupstream_version\fP is
+described below.  The \fIupstream_version\fP
+portion of the version number is mandatory.
+.IP
+The \fIupstream_

Re: Bug#432893: Accepted dpkg 1.14.7~newshlib (source i386 all)

2007-10-06 Thread Kurt Roeckx
On Fri, Sep 28, 2007 at 05:42:41PM +0100, Ian Jackson wrote:
> Raphael Hertzog writes ("Accepted dpkg 1.14.7~newshlib (source i386
> all)"):
> > [stuff]
> 
> I'm very pleased to see all of this work being done on the Perl
> scripts - I'm hoping for big compatibility improvements from Raphael's
> shared library management changes.
> 
> But I did want to comment on this:
> >* After 'prerm remove' fails and while doing the error unwinding, if
> >  the 'postinst abort-remove' call succeeds, preserve the old status
> >  instead of unconditionally setting it to 'Installed'. Closes: #432893
> >  Thanks to Brian M. Carlson.
> 
> I don't think this change is correct.  If the documentation wasn't
> clear then it should have been clarified.
> 
> If the postinst abort-remove is executed and completes
> successfully, the package should be regarded as installed.

I don't agree for reasons I stated before. I have filed a bug
against policy: #443334

I suggest we talk about this on the policy list.


Kurt


-- 
To UNSUBSCRIBE, email to [EMAIL PROTECTED]
with a subject of "unsubscribe". Trouble? Contact [EMAIL PROTECTED]



Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Joey Hess wrote:
> > Maybe providing a feature on packages.debian.org (or similar) to download
> > sources in simple, non-VC, tarball format would make this a complete
> > non-issue though?
> 
> pristine-tar could be used for this, it would just need source packages
> to put the delta somewhere standardised (under debian/), and would need
> some standardised way to get to the upstream source branch in git.

BTW, if that were standardised, the other option would be for
dpkg-source -x to regenerate the pristine upstream tarball.

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Joey Hess wrote:
> > Bonus points: rather than "debian/rules clean, create a diff, build",
> > have dpkg do "debian/rules clean, commit any uncommitted changes with the
> > commit message being the changes from the changelog, create a .git.tgz,
> > build" for git-source-format packages.
> 
> I have a feeling that any auto-commit stuff should be controlled by an
> option. I'm *sure* that it would annoy some developers. No strong
> feelings about whether it should default on or off, though least surprise
> suggests *off*.

One problem with auto-committing is tags. Developers will probably
want to tag their release before doing the final release build, and
if dpkg-source then found and auto-committed a further change, the tag
wouldn't accurately match the release.

-- 
see shy jo




[PATCH/RFC] dpkg-source.pl: Support a subset of wig&pen on build

2007-10-06 Thread Frank Lichtenheld
Use .orig.tar.(bz2|lzma) if they are available
and no .gz can be found. Also let the user specify
via -C(gz|bz2|lzma) how files that need to be
generated should be compressed.

I think this is about the maximum support for wig&pen we can add in
dpkg-source -b without big code changes. But it might be a useful one,
especially for big packages. Such packages could maybe be allowed for
lenny (buildds still running sarge might pose a problem, though).
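
The selection logic described above can be sketched as follows. This is an illustration only, not the patch itself (which is Perl): the function names are made up, and trying bz2 before lzma when no .gz exists is an assumption.

```python
# Same list as @comp_supported in the patch.
SUPPORTED = ("gz", "bz2", "lzma")


def find_orig_tarball(basename, existing_files):
    """Return the first existing <basename>.orig.tar.<ext>, preferring gz."""
    for ext in SUPPORTED:
        candidate = f"{basename}.orig.tar.{ext}"
        if candidate in existing_files:
            return candidate
    return None


def dsc_format(compression="gz"):
    """Format: stays 1.0 for gzip; any other compression means 2.0."""
    if compression not in SUPPORTED:
        raise ValueError(f"{compression} is not a supported compression")
    return "1.0" if compression == "gz" else "2.0"
```

For example, dsc_format("lzma") gives "2.0", which is why the changelog entry warns that the result is a Format: 2.0 source package.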

Gruesse,
Frank

---
 ChangeLog  |9 
 debian/changelog   |3 +
 scripts/dpkg-source.pl |  121 ++-
 3 files changed, 89 insertions(+), 44 deletions(-)

diff --git a/ChangeLog b/ChangeLog
index b1eafab..8ad0186 100644
--- a/ChangeLog
+++ b/ChangeLog
@@ -1,3 +1,12 @@
+2007-10-06  Frank Lichtenheld  <[EMAIL PROTECTED]>
+
+   * scripts/dpkg-source.pl: Support a subset of
+   wig&pen (aka Format: 2.0) on build:
+   Use .orig.tar.(bz2|lzma) if they are available
+   and no .gz can be found. Also let the user specify
+   via -C(gz|bz2|lzma) how files that need to be
+   generated should be compressed.
+
 2007-09-29  Frank Lichtenheld  <[EMAIL PROTECTED]>
 
* scripts/dpkg-buildpackage.pl: Call checkversion()
diff --git a/debian/changelog b/debian/changelog
index 6c33f1c..3ea39e0 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -35,6 +35,9 @@ dpkg (1.14.7) UNRELEASED; urgency=low
 Closes: #379418
   * Let dpkg-buildpackage error out early if the version number from
 the changelog is not a valid Debian version. Closes: #216075
+  * Allow to use other compressions than gzip on dpkg-source -b
+(NOTE: this will result in a Format: 2.0 source package!).
+Closes: #382673
 
   [ Updated dpkg translations ]
   * Basque (Piarres Beobide). Closes: #440859
diff --git a/scripts/dpkg-source.pl b/scripts/dpkg-source.pl
index c036478..e45f560 100755
--- a/scripts/dpkg-source.pl
+++ b/scripts/dpkg-source.pl
@@ -75,6 +75,10 @@ my $max_dscformat = 2;
 my $def_dscformat = "1.0"; # default format for -b
 
 my $expectprefix;
+my $compression = 'gz';
+my @comp_supported = qw(gz bz2 lzma);
+my %comp_supported = map { $_ => 1 } @comp_supported;
+my $comp_regex = '(?:gz|bz2|lzma)';
 
 # Packages
 my %remove;
@@ -171,6 +175,8 @@ Build options:
   -ss  trust packed & unpacked orig src are same.
   -sn  there is no diff, do main tarfile only.
   -sA,-sK,-sP,-sU,-sR  like -sa,-sk,-sp,-su,-sr but may overwrite.
+  -C  select compression to use (defaults to 'gz',
+ supported are: %s).
 
 Extract options:
   -sp (default)leave orig source packed in current dir.
@@ -182,7 +188,8 @@ General options:
   --versionshow the version.
 "), $progname,
 $diff_ignore_default_regexp,
-join('', map { " -I$_" } @tar_ignore_default_pattern);
+join('', map { " -I$_" } @tar_ignore_default_pattern),
+"@comp_supported" ;
 }
 
 sub handleformat {
@@ -201,6 +208,10 @@ while (@ARGV && $ARGV[0] =~ m/^-/) {
 &setopmode('build');
 } elsif (m/^-x$/) {
 &setopmode('extract');
+} elsif (m/^-C/) {
+   $compression = $POSTMATCH;
+   usageerr(sprintf(_g("%s is not a supported compression"), $compression))
+   unless $comp_supported{$compression};
 } elsif (m/^-s([akpursnAKPUR])$/) {
warning(sprintf(_g("-s%s option overrides earlier -s%s option"), $1, 
$sourcestyle))
if $sourcestyle ne 'X';
@@ -269,7 +280,7 @@ if ($opmode eq 'build') {
 
 parsechangelog($changelogfile, $changelogformat);
 parsecontrolfile($controlfile);
-$f{"Format"}=$def_dscformat;
+$f{"Format"}= $compression eq 'gz' ? $def_dscformat : '2.0';
 &init_substvars;
 
 my @sourcearch;
@@ -381,7 +392,7 @@ if ($opmode eq 'build') {
 $basedirname =~ s/_/-/;
 
 my $origdir = "$dir.orig";
-my $origtargz = "$basename.orig.tar.gz";
+my $origtargz;
 if (@ARGV) {
 my $origarg = shift(@ARGV);
 if (length($origarg)) {
@@ -392,7 +403,7 @@ if ($opmode eq 'build') {
 $sourcestyle =~ y/aA/rR/;
 $sourcestyle =~ m/[ursURS]/ ||
 &error(sprintf(_g("orig argument is unpacked but source 
handling style".
-   " -s%s calls for packed (.orig.tar.gz)"), 
$sourcestyle));
+   " -s%s calls for packed (.orig.tar.)"), 
$sourcestyle));
 } elsif (-f _) {
 $origtargz= $origarg;
 $sourcestyle =~ y/aA/pP/;
@@ -408,22 +419,28 @@ if ($opmode eq 'build') {
 &error(sprintf(_g("orig argument is empty (means no orig, no 
diff)".
" but source handling style -s%s wants something"), 
$sourcestyle));
 }
-}
-
-if ($sourcestyle =~ m/[aA]/) {
-if (stat("$origtargz")) {
--f _ || &error(sprintf(_g("packed orig `%s' exists but is no

Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Frank Lichtenheld
On Sat, Oct 06, 2007 at 05:27:04PM +1000, Anthony Towns wrote:
> On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
> This means you can't build the package by hand with standard unix tools
> -- at the very least you need git installed, and if other VC systems
> are to be supported, you need them too. Changes in repository formats
> will presumably result in versioned dependencies too.
> 
> This is slightly worse than the case for existing patch management tools
> in that most of those can be dealt with by hand; though cdbs and to a
> lesser extent debhelper can't be quite as easily replicated I guess.

A similar problem arises with Format: 2.0 packages as well if the user
hasn't bzip2 (unlikely) or lzma (likely) installed and tries to unpack
a source package built with them.

Gruesse,
-- 
Frank Lichtenheld <[EMAIL PROTECTED]>
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Frank Lichtenheld
On Sat, Oct 06, 2007 at 11:19:43AM -0400, Joey Hess wrote:
> Anthony Towns wrote:
> > Is a .gitdiff.tar.gz possible, so the archive doesn't need to have the
> > full git repo replaced by each upload? ie, something like
> > 
> > Files:
> >   foo_1.0-1.git.tar.gz
> >   foo_1.0-2.gitdiff.tar.gz
> > 
> > so that a small patch only adds a small file to the archive rather than
> > replacing a large one?
> 
> I think it's possible, the gitdiff might use git packs against a prior
> repo. That would be a nice enhancement to what I have done.

I think there is a mechanism in git to disallow replacing old pack
files (i.e. forcing to create additional ones with only new objects),
however, I haven't used that myself, yet.

On a general note: I think we definitely could use the better tarball
compression support _before_ adding huge amounts of history into the
archive...

Gruesse,
-- 
Frank Lichtenheld <[EMAIL PROTECTED]>
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Colin Watson
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
> I've been working on making dpkg-source support a new source package
> format based upon git. The idea is that a source package has only a
> .dsc and a .git.tar.gz, which is just a git repo.

So, I can't stand git's user interface. I generally try to avoid making
a huge issue of this since it seems to be massively political on places
like Planet at the moment, there seems to be a certain amount of
confusion of people's personal opinions with that of their employers
going on, and in any case I normally find that revision control
flamewars have negative utility. (I don't think it's terribly relevant
to this discussion why I prefer not to use git, and I don't want to
sidetrack the thread with that; I just wanted to present an existence
case of somebody who doesn't want to switch to .git.tar.gz and yet
doesn't want to stay with .orig.tar.gz and .diff.gz forever.)

Still, this work looks pretty cool, and I'd like to be able to make use
of it despite avoiding git whenever I can. I noticed that you'd
helpfully structured your changes such that it would be possible to plug
in a different revision control system, so I wrote a module to support
bzr. The patch is attached to this e-mail, and I'd appreciate comments;
if this work is merged into dpkg I'd be very happy if my addition were
merged too. There are probably some improvements to be made, but it was
really utterly trivial; I was impressed that I didn't have to touch
anything else beyond plugging in a new module. Ironically, of course, I
did use git to create it. :-)


While working on this I was thinking about general issues with the
format. It seems to me that it's suboptimal not to ship a working tree.
I know you sort of address this in the wiki FAQ, and I realise that
there are space advantages to only shipping the VCS data. However, I'd
like to try to persuade you otherwise if I can. My concerns are:

  * Users will need to have the VCS installed in order to inspect the
source.

It's true that this is no worse than dbs or dpatch or whatever, and
in fact it's better because dpkg-source will take care of the
unpacking step automatically. Still, I do think it is a downside; we
do still ship /usr/share/doc/debian/source-unpack.txt, and people do
unpack Debian source packages on other systems from time to time and
inspect them (I certainly do the same in the other direction with
source RPMs, and curse their complexity). Plus, if the VCS fails to
reconstitute a working tree for some unforeseen reason (maybe you
have a broken installation of it, or maybe there was some version
skew, or something else), then you're rather screwed. Tarballs are
nice and simple and, assuming they were transferred accurately,
hardly ever break in ways that make it impossible for you to extract
the files.

  * Buildds will need to have the VCS installed in their base system.

Possibly a minor concern since sbuild does the unpack in the base
rather than in the chroot, but it's there nevertheless. Every
derivative distribution that runs its own buildds will need to take
care of this too.

  * Some source packages want to ship non-VCS-managed files.

It's very common for source packages to include autogenerated
objects like configure, Makefile.in, etc. Whether to check these
into a VCS is a somewhat religious matter (as acknowledged by the
gettext info documentation, for instance), and personally I lean
towards checking them in (with a few exceptions) just because it
makes it easier to see when they change and keep an eye out for
oddities, but I know that a lot of developers prefer to keep these
outside their VCS. Shipping a working tree would make it easier to
handle cases like this.

There are two obvious modifications to Joey's proposal that would allow
shipping a working tree. The first is just to include the working tree
in the .$VCS.tar.gz object. This has the advantage of being trivial to
implement on top of the current code: the git module would need to do a
'git checkout' after copying the .git, and the bzr module just wouldn't
call 'bzr remove-tree'.

The second possibility seems to me to be more flexible, though, and
probably not all that hard to implement: build both a .tar.gz
(containing the working tree) and a .$VCS.tar.gz, and teach 'dpkg-source
-x' to unpack the tree given at least one of these. This would allow
various interesting possibilities such as:

  * Buildds could just fetch the .tar.gz; they have no need of the VCS
data. Users who just want to inspect the current version of the
source and not change it might want to do this too, using (say)
'apt-get source --no-vcs package'.

  * Developers on slow connections could say 'apt-get source --vcs-only
package' to fetch just the .$VCS.tar.gz, with the documented caveat
that it would be just like checking the source out of a VCS in t

Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Colin Watson
On Sat, Oct 06, 2007 at 11:17:58PM +0200, Frank Lichtenheld wrote:
> On Sat, Oct 06, 2007 at 05:27:04PM +1000, Anthony Towns wrote:
> > This means you can't build the package by hand with standard unix tools
> > -- at the very least you need git installed, and if other VC systems
> > are to be supported, you need them too. Changes in repository formats
> > will presumably result in versioned dependencies too.
> > 
> > This is slightly worse than the case for existing patch management tools
> > in that most of those can be dealt with by hand; though cdbs and to a
> > lesser extent debhelper can't be quite as easily replicated I guess.
> 
> A similar problem arises with Format: 2.0 packages as well if the user
> hasn't bzip2 (unlikely) or lzma (likely) installed and tries to unpack
> a source package built with them.

Perhaps 'apt-get source' et al could notice this class of situation and
offer to install the necessary unpacking tools for you. It'd have to
rely on sudo or similar as 'apt-get source' is typically run as
non-root, but it seems like a useful enhancement even so.
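
A minimal sketch of the detection half of that idea, assuming the decompressor binaries are named gzip, bzip2 and lzma. The function is hypothetical and not part of apt; offering to install anything is left out.

```python
import shutil

# Assumption: file extension -> name of the binary needed to unpack it.
TOOL_FOR_EXT = {"gz": "gzip", "bz2": "bzip2", "lzma": "lzma"}


def missing_unpack_tools(filenames, which=shutil.which):
    """List the decompressors needed by filenames but not found on PATH."""
    missing = []
    for name in filenames:
        ext = name.rsplit(".", 1)[-1]
        tool = TOOL_FOR_EXT.get(ext)
        # 'which' returns None when the binary cannot be found.
        if tool and which(tool) is None and tool not in missing:
            missing.append(tool)
    return missing
```

The which parameter defaults to shutil.which and is only there so the lookup can be replaced when trying the function out.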

-- 
Colin Watson   [EMAIL PROTECTED]





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Frank Lichtenheld
On Sat, Oct 06, 2007 at 10:37:48PM +, Colin Watson wrote:
> On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
> > I've been working on making dpkg-source support a new source package
> > format based upon git. The idea is that a source package has only a
> > .dsc and a .git.tar.gz, which is just a git repo.
[...]
> Still, this work looks pretty cool, and I'd like to be able to make use
> of it despite avoiding git whenever I can. I noticed that you'd
> helpfully structured your changes such that it would be possible to plug
> in a different revision control system, so I wrote a module to support
> bzr. The patch is attached to this e-mail, and I'd appreciate comments;
> if this work is merged into dpkg I'd be very happy if my addition were
> merged too. There are probably some improvements to be made, but it was
> really utterly trivial; I was impressed that I didn't have to touch
> anything else beyond plugging in a new module. Ironically, of course, I
> did use git to create it. :-)

I guess if we use Joey's idea at all we will not be able to avoid
shipping such a module for each distributed VCS, and I didn't get
the impression that Joey thought otherwise. So I find your mail
strangely defensive :)

The code itself looks good AFAICT.

> While working on this I was thinking about general issues with the
> format. It seems to me that it's suboptimal not to ship a working tree.
> I know you sort of address this in the wiki FAQ, and I realise that
> there are space advantages to only shipping the VCS data. However, I'd
> like to try to persuade you otherwise if I can. My concerns are:

Shipping the worktree essentially means defining this new format as
an optional add-on, since you ship all the data you ship now plus some
VCS metadata. So all packages will have to be bigger than they
are now (aside from using other compression methods than gzip, and
after really building some packages today with my dpkg-source -C patch
I have to say I'm impressed how much space we might be able to save -
with high CPU costs, though). This is not really an argument for either
side, just wanted to make this effect clear.

>   * Users will need to have the VCS installed in order to inspect the
> source.
[...]
>   * Buildds will need to have the VCS installed in their base system.
[...]
>   * Some source packages want to ship non-VCS-managed files.
[...]

Is the last one really such a big problem in Debian? I know that many upstream
VCS don't contain autogenerated files but most .orig.tar.gz's already
contain them today, so I would have guessed people either only have
their debian/ in their Debian VCS or all upstream files from the
.orig.tar.gz.

> There are two obvious modifications to Joey's proposal that would allow
> shipping a working tree. The first is just to include the working tree
> in the .$VCS.tar.gz object. This has the advantage of being trivial to
> implement on top of the current code: the git module would need to do a
> 'git checkout' after copying the .git, and the bzr module just wouldn't
> call 'bzr remove-tree'.

This would be a bad idea IMHO, and like a regression: instead of
shipping a .orig.tar+diff we now ship one, monolithic (bigger) tarball?
Sounds suboptimal. I'm pretty sure I don't want to see this one
implemented in dpkg-dev.

> The second possibility seems to me to be more flexible, though, and
> probably not all that hard to implement: build both a .tar.gz
> (containing the working tree) and a .$VCS.tar.gz, and teach 'dpkg-source
> -x' to unpack the tree given at least one of these. This would allow
> various interesting possibilities such as:

Since you're essentially demoting the new format to an add-on, why not
just really make it one and just ship a real Format: 1.0 package
(i.e. orig-tar+diff or native-tar) instead of this
half-half working-tree tarball?
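
For reference, the "unpack the tree given at least one of these" rule from the quoted proposal could be sketched like this. It is purely illustrative: the step descriptions, the function names and the list of VCS names are assumptions, and nothing is actually extracted.

```python
VCS_NAMES = ("git", "bzr")  # assumption: VCSes with a dpkg-source module


def classify(files):
    """Return (working_tree_tarball, vcs_tarball); either may be None."""
    tree = vcs = None
    for name in files:
        if not name.endswith(".tar.gz"):
            continue
        stem = name[: -len(".tar.gz")]
        if stem.rsplit(".", 1)[-1] in VCS_NAMES:
            vcs = name
        else:
            tree = name
    return tree, vcs


def unpack_plan(files):
    """Describe the unpack steps for whichever tarballs were fetched."""
    tree, vcs = classify(files)
    if tree is None and vcs is None:
        raise ValueError("need a working-tree tarball or a VCS tarball")
    plan = []
    if tree:
        plan.append(("extract working tree", tree))
    if vcs:
        plan.append(("extract VCS data", vcs))
    if tree is None:
        # Only the VCS data was fetched: the tree has to be checked out.
        plan.append(("check out working tree from VCS data", vcs))
    return plan
```

With only the plain tarball present the plan never touches a VCS, which is the case the buildds would be in.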

[...]
> These seem to me to be non-trivial advantages that outweigh the space
> costs of shipping around the working tree. I'd be willing to have a go
> at implementing this once I've had a bit more sleep.
> 
> Does any of this make sense?

I guess there are two aspects to Joey's proposal:

1) Make the source package more useful by including VCS metadata like
   history

2) Make it easier to include arbitrary changes to the upstream sources
   by using more advanced tools than diff/patch, i.e. a DVCS

By concentrating on the first point and making it optional you either have
to sacrifice point 2 by reusing the old source package (orig+diff) or give
people who choose not to download the vcs data a worse experience by
making it harder for them to find the actual diff (working tree tar).

On second thought you can reduce the regression by adding a pristine-gz
delta to the working tree so that you can split the working tree tarball
back into an orig+diff.

On third thought who says you have to fall back to Format 1.0 for the
non-VCS data? You could also fall back to Format 2.0 which would make
preserving advantage 2 easier.

So

Re: [PATCH/RFC] dpkg-source.pl: Support a subset of wig&pen on build

2007-10-06 Thread Frank Lichtenheld
On Sat, Oct 06, 2007 at 11:09:21PM +0200, Frank Lichtenheld wrote:
> Use .orig.tar.(bz2|lzma) if they are available
> and no .gz can be found. Also let the user specify
> via -C(gz|bz2|lzma) how files that need to be
> generated should be compressed.

Hmm, I just noticed that dpkg-genchanges already uses -C, so it would be
difficult to pass this option down from dpkg-buildpackage. Anyway, the
name of the option is not really important at this point, yet.
(-z perhaps? Seems to be free)

Gruesse,
-- 
Frank Lichtenheld <[EMAIL PROTECTED]>
www: http://www.djpig.de/





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Frank Lichtenheld wrote:
> I think there is a mechanism in git to disallow replacing old pack
> files (i.e. forcing to create additional ones with only new objects),
> however, I haven't used that myself, yet.

The packs in the diff package would be basically the same packs that
git-send-pack generates when git is pushing objects to a remote
repository. Where the "remote" repo would be the contents of
foo_1.0-1.git.tar.gz, and the "local" repo would be foo-1.0-2. Intercept
those packs in transit (how?), and then you can take the 1.0-1 repo
and later apply them to it to regenerate the 1.0-2 repo.

> On a general note: I think we definitely could use the better tarball
> compression support _before_ adding huge amounts of history into the
> archive...

This would mostly be an optimisation for upload size; total archive size
is only affected if foo 1.0-1 is in testing and 1.0-2 in unstable.

It's actually much more significant to both upload and total archive
size that all 61 MB of dpkg's .git not be put into its .git.tar.gz. Thus
the shallow clones with only a few hundred revs or so.

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Colin Watson wrote:
> So, I can't stand git's user interface. I generally try to avoid making
> a huge issue of this since it seems to be massively political on places
> like Planet at the moment, there seems to be a certain amount of
> confusion of people's personal opinions with that of their employers
> going on, and in any case I normally find that revision control
> flamewars have negative utility. (I don't think it's terribly relevant
> to this discussion why I prefer not to use git, and I don't want to
> sidetrack the thread with that; I just wanted to present an existence
> case of somebody who doesn't want to switch to .git.tar.gz and yet
> doesn't want to stay with .orig.tar.gz and .diff.gz forever.)

(So, FWIW, I'm not sold on git. Not sold at all yet. But it was a good
choice for this implementation for several reasons.)

> Still, this work looks pretty cool, and I'd like to be able to make use
> of it despite avoiding git whenever I can. I noticed that you'd
> helpfully structured your changes such that it would be possible to plug
> in a different revision control system, so I wrote a module to support
> bzr.

Nice. The FAQ has some questions aimed at adding other revision control
systems, could you try to answer those in the context of bzr? In
particular, is the data that would be shipped in the source package the
same data that bzr normally reads from untrusted sources, thus ensuring
that using it this way is equally (in)secure as using bzr to pull data
over the network? (Note that this wasn't 100% true for git and I have
had to put in several workarounds.) And is the data format stable and/or
one that bzr has a history of supporting old versions of in a way that
ensures backwards compatibility?

Also, will the bzr repos always contain the full history, or is there
an equivalent to git shallow clones? How big do they tend to be?

> It's true that this is no worse than dbs or dpatch or whatever, and
> in fact it's better because dpkg-source will take care of the
> unpacking step automatically. Still, I do think it is a downside; we
> do still ship /usr/share/doc/debian/source-unpack.txt

BTW, source-unpack.txt fails for both packages containing
debian/subdirs/ and of course for wig-n-pen..

>   * Buildds will need to have the VCS installed in their base system.

This seems easily solved by recommends (installed by default).

>   * Some source packages want to ship non-VCS-managed files.
> 
> It's very common for source packages to include autogenerated
> objects like configure, Makefile.in, etc. Whether to check these
> into a VCS is a somewhat religious matter (as acknowledged by the
> gettext info documentation, for instance), and personally I lean
> towards checking them in (with a few exceptions) just because it
> makes it easier to see when they change and keep an eye out for
> oddities, but I know that a lot of developers prefer to keep these
> outside their VCS. Shipping a working tree would make it easier to
> handle cases like this.

Hmm, I hadn't considered that this might be a problem.

I don't know if I'd want to write the code to do this, but shipping a
partial working tree consisting of just those files would be enough to
solve this.

>   * Space-constrained mirrors could conceivably exclude the VCS data if
> they had to, though we probably wouldn't encourage this.
> 
> These seem to me to be non-trivial advantages that outweigh the space
> costs of shipping around the working tree.

The space constraints seem pretty hard to me. Specifically, I don't want
to piss the ftpmasters off and get vcs source packages banned from the
archive.. The only saving grace really seems to be that shipping both
vcs and upstream tar will only double the size of the archive once most
everything uses the new format, and the archive will have probably
doubled in size several times over due to other factors before then.


I've eyeballed the code; it looks ok, though it's so close to code I've been
looking at all week that I may be missing trees for the forest. :-)

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Joey Hess
Frank Lichtenheld wrote:
> I guess if we use Joey's idea at all we will not be able to avoid
> shipping such a module for each distributed VCS, and I didn't get
> the impression that Joey thought otherwise.

I do think otherwise. If the distributed (or other) VCS does not meet
our criteria for security and backwards compatibility, then we should
not ship it.

And yes, it'll be up to the dpkg maintainers to enforce those criteria
if you crack open the floodgates..

> Is the last one really such a big problem in Debian? I know that many upstream
> VCS don't contain autogenerated files but most .orig.tar.gz's already
> contain them today, so I would have guessed people either only have
> their debian/ in their Debian VCS or all upstream files from the
> .orig.tar.gz.

So would I, and most of the tools like git-buildpackage seem to assume
it too and not try to support this case AFAICS. Colin's probably right
that it's an issue religious wars can be fought over, but if they're
being fought in the context of keeping package source in revision
control it's happening quietly..

-- 
see shy jo




Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Anthony Towns
On Sat, Oct 06, 2007 at 11:19:43AM -0400, Joey Hess wrote:
> Anthony Towns wrote:
> > Changes in repository formats will presumably result in versioned
> > dependencies too.
> I don't think that dpkg should add vcs formats that we don't have a good
> expectation of remaining supported by newer versions of the tools going
> forward (so svn repos are out). 

It's more that newer versions of the tools will create more optimised
repo formats, that older versions don't support -- bzr has done this
between etch and lenny, eg.

My inclination would be to have dpkg support it, but have it generate
a REJECT at upload time if we don't want to support the new format (yet).

> If the format changes in a non-backwards compatible way, we could have
> source packages built on unstable that cannot be extracted on stable,
> which I also think is suboptimal, but hard to completely avoid.

Well, that's true of any Version: 3 format already anyway.

> > Once the unpack is done, I don't see any reason why you can't do an NMU
> > in the traditional way, so presuming "dpkg-source -x" or "apt-get source"
> > handles the unpack automatically, I don't think it necessarily imposes
> > any new requirements on NMUers.
> Basically, you have to know how to git commit your changes before building
> the NMU, and that's all. As a bonus, it's rather easier to generate NMU
> patchsets. :-)

Well, there's two options:

- dpkg-source knows it's "meant to be" a git package, and
  can either warn you you have uncommitted changes (and tell
  you what to do) or just auto commit them for you

- dpkg-source doesn't know what sort of package it's meant to be
  and just builds a v1 source package

Both of which sound pretty trivial for an NMUer to deal with...
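
As a rough sketch of those two options (Python, purely illustrative): the
function names and the action strings below are made up for this example
and are not part of dpkg-source. The only real interface used is
"git status --porcelain", whose short format puts two status columns and
a space in front of each path.

```python
import subprocess


def uncommitted_paths(porcelain_output):
    """Parse `git status --porcelain` output into a list of changed paths.

    Each line is "XY path": two status columns, one space, then the path.
    Renames ("old -> new") and quoted paths are returned as-is.
    """
    paths = []
    for line in porcelain_output.splitlines():
        if len(line) > 3:
            paths.append(line[3:])
    return paths


def choose_build_action(is_git_package, dirty_paths, auto_commit=False):
    """Pick a (hypothetical) action for building the source package."""
    if not is_git_package:
        # Second option: nothing says it is a git package, build a v1 package.
        return "build-v1"
    if not dirty_paths:
        return "build-git"
    # First option: either commit for the user or warn and explain.
    return "auto-commit" if auto_commit else "warn"


def git_status(tree):
    """Run git in the unpacked tree and return the porcelain status text."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=tree, check=True, capture_output=True, text=True,
    )
    return result.stdout
```

An NMUer's build would then amount to something like
choose_build_action(True, uncommitted_paths(git_status("."))).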

> > Maybe providing a feature on packages.debian.org (or similar) to download
> > sources in simple, non-VC, tarball format would make this a complete
> > non-issue though?
> pristine-tar could be used for this, it would just need source packages
> to put the delta somewhere standardised (under debian/), and would need
> some standardised way to get to the upstream source branch in git.

So the logic there would be:

    if there's an upstream tag, then
        generate an .orig.tgz
        if there's a pristine-tar info,
            hax0r it to be pristine
        generate a .diff.gz
        if the .diff failed goto bailout
        generate a .dsc containing the orig and diff
        publish all three
    else:
      (bailout:)
        generate a .tar.gz
        generate a .dsc containing the tar
        publish both
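
The same control flow as a small Python sketch. It only plans the steps
and returns which files would be published; nothing here calls git or
pristine-tar, and the function name, parameters and step strings are
assumptions made for the example.

```python
def plan_v1_export(has_upstream_tag, has_pristine_delta, diff_ok):
    """Return (steps, files_to_publish) for the fallback logic above."""
    steps = []
    if has_upstream_tag:
        steps.append("generate .orig.tar.gz")
        if has_pristine_delta:
            steps.append("make .orig.tar.gz pristine")
        steps.append("generate .diff.gz")
        if diff_ok:
            steps.append("generate .dsc (orig + diff)")
            return steps, [".orig.tar.gz", ".diff.gz", ".dsc"]
    # Bailout: no upstream tag, or the diff could not be generated.
    steps.append("generate .tar.gz")
    steps.append("generate .dsc (tar)")
    return steps, [".tar.gz", ".dsc"]
```

Note that a failed diff falls through to the same bailout path as a
missing upstream tag, which is what the goto expresses.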

> > Would it make sense to have the source format look more like:
> > Format: 3.0
> > Source: dpkg
> > ...
> > Source-Depends: git-dpkg (>= 3.14159)
> > Source-Hooks: /usr/bin/git-dpkg
> > ...
> > Files:
> >  ... foo_1.2.git.tar.gz
> > You could drop the Source-Hooks: line, and just have dpkg-source know
> > to associate *.git.tar.gz with /usr/lib/dpkg/source/git, and trust the
> > package will provide it.
> Not sure if this buys anything that using perl modules for the vcses
> can't do, really. 

It doesn't buy anything extra, so forget the Source-Hooks: and just
consider it to be a different package providing the VCS-specific perl
module.

That buys you:
- no changes to dpkg to support new source formats
- easy for other distros to support more or fewer VCS formats
- version info to deal with new repo formats
- explicit dependency info that can be checked at upload time
  to block source formats we don't want to support
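
For illustration, the "associate *.git.tar.gz with
/usr/lib/dpkg/source/git" lookup could be as small as the Python sketch
below. The directory comes from the quoted text; the function name, the
idea of checking that the handler exists on disk, and the injectable
"exists" parameter are assumptions for the example, not dpkg behaviour.

```python
import os

# Directory named earlier in the thread; its use here is an assumption.
HANDLER_DIR = "/usr/lib/dpkg/source"


def handler_for(filename, exists=os.path.exists):
    """Map foo_1.2.<vcs>.tar.gz to a handler path, or None if unsupported."""
    if not filename.endswith(".tar.gz"):
        return None
    stem = filename[:-len(".tar.gz")]
    _, dot, vcs = stem.rpartition(".")
    # Require a plain alphanumeric component so no path tricks are possible.
    if not dot or not vcs.isalnum():
        return None
    path = f"{HANDLER_DIR}/{vcs}"
    # Whichever package ships the handler decides what is supported.
    return path if exists(path) else None
```

With only a git handler installed, foo_1.2.git.tar.gz resolves to
/usr/lib/dpkg/source/git while foo_1.2.bzr.tar.gz and foo_1.2.orig.tar.gz
resolve to nothing, without dpkg itself knowing any VCS names.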

> How do you envision this helping deal with repository
> format changes?

Repo formats that bzr in etch can unpack could be denoted by

Source-Depends: dpkg-bzr (>= 0.11)

while repo formats that require bzr from lenny or later could be
denoted by:

Source-Depends: dpkg-bzr (>= 0.18)

(Or you could have a versioning scheme that matches the repo format
directly, rather than the program being used. Or you could use virtual
packages and say dpkg-bzr-v3 and have that be Provided: by some package/s,
etc)

It'd be straightforward to make a policy decision to only ACCEPT uploads
with given Source-Depends: lines, eg ones that can be satisfied using
packages from stable, while letting third party repos experiment with
new repo formats without needing to use a different dpkg than Debian does.
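
A minimal sketch of such an upload-time check, in Python. Source-Depends
is the field proposed above, not an existing one; the function name is
made up, only ">=" relations are handled, and versions are compared as
plain dotted numbers rather than with the full Debian version ordering.

```python
import re

# "pkg" or "pkg (>= 1.2.3)"; anything else is treated as unparseable.
_DEP = re.compile(
    r"^\s*(?P<pkg>[a-z0-9][a-z0-9+.-]*)\s*"
    r"(?:\(\s*>=\s*(?P<ver>[0-9]+(?:\.[0-9]+)*)\s*\))?\s*$"
)


def _key(version):
    """Simplified ordering: compare dotted numeric versions component-wise."""
    return tuple(int(part) for part in version.split("."))


def upload_allowed(source_depends, stable_versions):
    """True if every Source-Depends clause is satisfiable from stable.

    stable_versions maps package name to its (dotted numeric) version.
    """
    for clause in source_depends.split(","):
        match = _DEP.match(clause)
        if match is None:
            return False
        have = stable_versions.get(match.group("pkg"))
        if have is None:
            return False
        want = match.group("ver")
        if want is not None and _key(have) < _key(want):
            return False
    return True
```

With stable shipping dpkg-bzr 0.11, "dpkg-bzr (>= 0.11)" would be
ACCEPTed and "dpkg-bzr (>= 0.18)" REJECTed.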

Cheers,
aj





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Anthony Towns
On Sat, Oct 06, 2007 at 10:37:48PM +, Colin Watson wrote:
> The second possibility seems to me to be more flexible, though, and
> probably not all that hard to implement: build both a .tar.gz
> (containing the working tree) and a .$VCS.tar.gz, and teach 'dpkg-source
> -x' to unpack the tree given at least one of these. This would allow
> various interesting possibilities such as:

Would this be better in any way than having a web interface that provides
an autogenerated version-1 source package? Presume it's a url like:

http://v1source.qa.debian.org/i/ifupdown/ifupdown_0.6.8.dsc

>   * Buildds could just fetch the .tar.gz; they have no need of the VCS
> data. Users who just want to inspect the current version of the
> source and not change it might want to do this too, using (say)
> 'apt-get source --no-vcs package'.

dget -x http://v1source.qa.debian.org/i/ifupdown/ifupdown_0.6.8.dsc

>   * Developers on slow connections could say 'apt-get source --vcs-only
> package' to fetch just the .$VCS.tar.gz, with the documented caveat
> that it would be just like checking the source out of a VCS in that
> you might have to recreate some autogenerated files.

That happens automatically.

>   * Space-constrained mirrors could conceivably exclude the VCS data if
> they had to, though we probably wouldn't encourage this.

Mirrors wouldn't mirror the autogenerated stuff, so not an issue.

>   * Derivative distributions who are slow to upgrade their dpkg-source
> could still interoperate to some degree.

They'd need to pull sources from the autogenerated url; though they'd
still probably have Build-Depends: issues if they're not updating
packages generally.

>   * Tools like mc, vim's tar plugin, or
> http://www.mirrorservice.org/sites/ftp.debian.org/debian/ could
> still be used straightforwardly and without modifications to look
> inside source packages on mirrors.

Again, you'd have to go to the autogenerating url rather than a mirror.

Cheers,
aj





Re: [PATCH] proposed v3 source format using .git.tar.gz

2007-10-06 Thread Anthony Towns
On Fri, Oct 05, 2007 at 07:16:13PM -0400, Joey Hess wrote:
> I've been working on making dpkg-source support a new source package format
> based upon git. 

Oh, one question that comes to mind: how does this affect checking for
non-free stuff in past revisions? If 3.1-4 had some non-free files that
get reimplemented for 3.2-1, do we (a) expect the maintainer to do a
no-history upload for 3.2-1; (b) check that this happens somehow; (c) not
worry about it as long as it's only in the history; (d) something else?

Verifying that not just the current tree is DFSG-free, but all the history
is too seems potentially difficult.
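
To give an idea of what such a check would have to look at, here is a
Python sketch that lists every path touched anywhere in the shipped
history, using "git log --all --name-only". The function names are
invented for the example. It only finds file names, not non-free
content, merge commits list no files by default, and a shallow clone
only exposes the history it actually contains.

```python
import subprocess


def paths_in_log(log_output):
    """Collect unique paths from `git log --name-only --pretty=format:`."""
    return sorted({line for line in log_output.splitlines() if line.strip()})


def all_history_paths(repo):
    """List every path touched by any commit reachable from any ref."""
    result = subprocess.run(
        ["git", "log", "--all", "--name-only", "--pretty=format:"],
        cwd=repo, check=True, capture_output=True, text=True,
    )
    return paths_in_log(result.stdout)
```

Comparing that list with the files in the current tree would at least
show which removed files still need a look.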

Cheers,
aj


