re: patches-applied historical imports (usd import)

2017-01-18 Thread peter green

> I haven't run dgit's dsc importer on a whole historical archive but
> Peter Green of Raspbian has been running it and filing bugs.  I haven't
> seen such a bug recently so I hope it has been working for all the
> packages he's seen.

I haven't really been doing large-scale importing; I have just imported
relatively shallow histories of a relative handful of packages.



___
vcs-pkg-discuss mailing list
vcs-pkg-discuss@lists.alioth.debian.org
http://lists.alioth.debian.org/cgi-bin/mailman/listinfo/vcs-pkg-discuss


Re: patches-applied historical imports (usd import)

2017-01-15 Thread Ian Jackson
Nish Aravamudan writes ("patches-applied historical imports (usd import)"):
> 1) Some source packages (bouncycastle, php7.0 are the ones I can think
> of off the top of my head) upstream tarballs contain .gitattributes

I investigated this.  After looking at the docs and playing about, I
concluded that the only sane approach is to record as the actual tree
object (in the git history) the contents of the Debian source package.
Otherwise it will become impossible to represent certain source
packages and all sorts of unanticipated madness could occur.

So I'm currently testing an enhancement to dgit which causes it to
 * unconditionally suppress transforming gitattributes when working
   behind the scenes to import a .dsc
 * by default, configure suppression of these attributes, when
   creating a fresh tree with `dgit clone' (and also in `dgit
   setup-new-tree')

The only other possibility would be to apply a lossless rename
operation to all .gitattributes, which would be worse.
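
As a rough illustration of the "suppress transforming attributes" idea
above (a sketch only, not dgit's actual code): git gives
$GIT_DIR/info/attributes precedence over any in-tree .gitattributes,
so a catch-all line there can unset the attributes that rewrite file
contents.  The attribute list and the helper names below are
assumptions made for the example.

```python
from pathlib import Path

# Attributes that can change file contents on checkout or commit.
# This list is an assumption for illustration, not dgit's own list.
TRANSFORMING_ATTRS = ("text", "eol", "ident", "filter")


def defuse_attributes_text(attrs=TRANSFORMING_ATTRS):
    """Return an info/attributes line that unsets `attrs` on every path."""
    return "* " + " ".join("-" + a for a in attrs) + "\n"


def install_defuse(git_dir):
    """Write the override into <git_dir>/info/attributes and return its path.

    `git_dir` is the repository's .git directory (hypothetical caller input).
    Any existing info/attributes file is overwritten in this sketch.
    """
    info = Path(git_dir) / "info"
    info.mkdir(parents=True, exist_ok=True)
    target = info / "attributes"
    target.write_text(defuse_attributes_text())
    return target
```

With such an override in place, an 'ident' or 'eol' entry shipped in an
upstream tarball should no longer alter what ends up in the working
tree, so the tree in git can stay byte-identical to the source package.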

Ian.



Re: patches-applied historical imports (usd import)

2017-01-10 Thread Ian Jackson
Ian Jackson writes ("Re: patches-applied historical imports (usd import)"):
> I suggest you read "Intent to commit craziness - source package
> unpacking" et seq, which were on this list in September.  Here's the
> archive:
> 
>   
> http://lists.alioth.debian.org/pipermail/vcs-pkg-discuss/2016-September/thread.html

And after you've read that, you can see that what I actually did was
even worse.

  
https://browse.dgit.debian.org/dgit.git/commit/?id=05db7ab051736bb106467e7f83b9e45684dd227c

You will want to look at the most recent dgit HEAD for the full
algorithm, because there are other workarounds for bugs in dpkg-source
etc.  Near here:

  https://browse.dgit.debian.org/dgit.git/tree/dgit#n2365

-- 
Ian Jackson <ijack...@chiark.greenend.org.uk>   These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: patches-applied historical imports (usd import)

2017-01-10 Thread Ian Jackson
Ian Jackson writes ("Re: patches-applied historical imports (usd import)"):
> Urk!  I have just filed:
...
>  https://bugs.debian.org/850845
>   dpkg-source fails to extract samba_3.6.5-2.dsc but exits status 0

This was actually in dget, not dpkg-source.  dpkg-source exits 2.

Ian.



Re: patches-applied historical imports (usd import)

2017-01-10 Thread Nish Aravamudan
On 09.01.2017 [14:33:29 -0800], Nish Aravamudan wrote:
> On 09.01.2017 [13:40:16 +0000], Ian Jackson wrote:
> > Nish Aravamudan writes ("patches-applied historical imports (usd import)"):



> > > ii) some patches may fail to apply with a trivial `quilt push`. This
> > > occurs with, at least, a historical publish of samba.
> > 
> > Do you have an example source package ?
> 
> I will re-run the import today and get that info for you.

src:samba 2:3.6.5-2 fails to unpack via `pull-debian-source`, via
`dpkg-source -x` (which the former calls), and via `quilt push`:

dpkg-source: info: the patch has fuzz which is not allowed, or is malformed
dpkg-source: info: if patch 'waf-as-source.patch' is correctly applied by quilt, use 'quilt refresh' to update it
dpkg-source: info: restoring quilt backup files for waf-as-source.patch
dpkg-source: error: LC_ALL=C patch -t -F 0 -N -p1 -u -V never -E -b -B .pc/waf-as-source.patch/ --reject-file=- < samba-3.6.5/debian/patches/waf-as-source.patch gave error exit status 1
pull-debian-source: Error: Source unpack failed.

Thanks,
Nish

-- 
Nishanth Aravamudan
Ubuntu Server
Canonical Ltd



Re: patches-applied historical imports (usd import)

2017-01-09 Thread Nish Aravamudan
On 09.01.2017 [13:40:16 +0000], Ian Jackson wrote:
> Nish Aravamudan writes ("patches-applied historical imports (usd import)"):
> > 1) Some source packages (bouncycastle, php7.0 are the ones I can think
> > of off the top of my head) upstream tarballs contain .gitattributes
> > files, which will change the behavior of git itself when checking out a
> > branch.
> 
> I need to do some tests to see exactly how to work around this.
> Can you point me at an affected source package version ?

I believe all php7.0 source packages will have a .gitattributes entry
containing 'ident' attributes amongst others.

> > 2) How do we determine if a source package is 1.0 vs. 3.0? I am
> > currently using `dpkg-source --print-format`, but have found one source
> > package (util-linux 2.13~rc3-5), where dpkg-source emits:
> > 
> > syntax error in /tmp/tmp3y515osf/util-linux-2.13~rc3/debian/control at
> > line 14: duplicate field Depends found
> > 
> > and thus we error out.
> 
> How exciting.  dgit looks at debian/source/format, which is what
> dpkg-source reads when generating the package.

Yep, based upon Sean's advice, I will switch to using that as well.

> > 3) Imagine the following graph in the git repository:
> > 
> >    A           D
> >  ->o.....f....>o->
> >    ^ .       . ^
> >    |  c     e  |
> >    o   .   .   o
> >    ^    . .    ^
> >    |     B     |
> >    o     o     o
> >    ^     ^     ^
> >    |     |     |
> >  ->o---->o---->o->
> >    a     b     d
> ...
> > Let's assert that there is a problem with obtaining the patches-applied
> > version of b. This can occur for (at least) the following reasons:
> > 
> > i) as in 2), we might not be able to determine the source package format
> > (implementation detail, to some degree), so are unable to correctly
> > derive if there is a patches-applied state that is distinct from b.
> 
> You can use dpkg-source -x to extract the patches-applied state
> corresponding to any .dsc, I think.  Any case where you can't is a bug
> in dpkg-source.

Right, that will get me to the fully patches-applied state, but what
about the intervening ones?

> > ii) some patches may fail to apply with a trivial `quilt push`. This
> > occurs with, at least, a historical publish of samba.
> 
> Do you have an example source package ?

I will re-run the import today and get that info for you.

> > In theory, there are other reasons/cases where this might happen and the
> > importer needs to never fail (so it is of some use to run automatically
> > :)
> 
> I haven't run dgit's dsc importer on a whole historical archive but
> Peter Green of Raspbian has been running it and filing bugs.  I haven't
> seen such a bug recently so I hope it has been working for all the
> packages he's seen.

And it's also very possible that it may not be a Debian publish, but an
Ubuntu publish where this occurs :)

> > The questions I have relate to what to do when we encounter this
> > situation, which in turn is divided into two parts:
> 
> I think this situation must be excluded.
> 
> This kind of difficulty is one of the main reasons for the failure of
> previous efforts to transition universally to git.  It is why dgit
> does not currently work, in Debian, with a full history of each
> package.  Instead there's a bit of a bodge, with a locally generated
> history.  Joey Hess and I came up with this approach at the Vaumarcus
> Debconf.
> 
> Eventually I intend to import the whole history but not yet.
> 
> If you intend to go forward without fully solving this problem, your
> choices are not good.

Agreed.

> > Let's presume that this failure is not persistent and that D is able to
> > be imported successfully, we again have to make decisions about
> > parenting. I think it only makes sense for one of c or f to exist, based
> > upon what we decide is the right policy above.
> 
> This is particularly troublesome.  When you can generate the missing
> package, you will not be able to make it an ancestor of your existing
> history.
> 
> You could rewrite the existing history and then pseudomerge the old
> and new histories together, but that means duplicating most of the
> history metadata.
> 
> 
> If you wait with making an official import until it is suitable for
> use with dgit, then I might consider using the relevant parts of your
> import's history as the official Debian dgit history.
> 
> That would save you from the problem which I foresee in our future,
> where Debian and Ubuntu have disjoint imports of the same history.

Right, ideally we'd be able (at some point in the future) to stop
importing Debian manually (from Launchpad) and trust dgit or some other
source of information for the Debian history (with some translation
layer to go from a version string to a commitish in that source).

-Nish

-- 
Nishanth Aravamudan
Ubuntu Server
Canonical Ltd


Re: patches-applied historical imports (usd import)

2017-01-09 Thread Nish Aravamudan
Hi Sean,

On 08.01.2017 [21:34:50 -0700], Sean Whitton wrote:
> Hello Nish,
> 
> On Fri, Jan 06, 2017 at 02:26:29PM -0800, Nish Aravamudan wrote:
> > 2) How do we determine if a source package is 1.0 vs. 3.0? I am
> > currently using `dpkg-source --print-format`, but have found one source
> > package (util-linux 2.13~rc3-5), where dpkg-source emits:
> 
> I would just introspect for the debian/source/format file.  If it has a
> line "3.0 (.*)" then it's 3.0, otherwise it's 1.0 (for all packages that
> were ACCEPTed to ftp-master).

Ok, is this documented in the manual or anywhere (for my own
edification)?  I spent some time searching, but didn't find anything
definitive except for the hint provided by `dpkg-source --print-format`.
And, presumably, if a historical publish exists without any
debian/source/format file, it should be treated as 1.0? Ah, yep, I see
that documented in `man dpkg-source` under DIAGNOSTICS.
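
The rule discussed here (a "3.0 (...)" line in debian/source/format
means 3.0, anything else or a missing file means 1.0) can be sketched
without invoking dpkg-source at all, so a broken debian/control cannot
get in the way.  The function names are hypothetical and this is only
a minimal sketch of the rule as stated in this thread:

```python
import re
from pathlib import Path

# Matches lines such as "3.0 (quilt)" or "3.0 (native)".
_FORMAT_3 = re.compile(r"^3\.0 \(.*\)$")


def classify_format(text):
    """Classify the contents of debian/source/format.

    `text` is the file's contents, or None if the file does not exist.
    Returns the matching "3.0 (...)" line, or "1.0" otherwise.
    """
    if text is None:
        return "1.0"
    for line in text.splitlines():
        line = line.strip()
        if _FORMAT_3.match(line):
            return line
    return "1.0"


def source_format(unpacked_dir):
    """Read debian/source/format under an unpacked source tree."""
    fmt_file = Path(unpacked_dir) / "debian" / "source" / "format"
    try:
        return classify_format(fmt_file.read_text(errors="replace"))
    except FileNotFoundError:
        return classify_format(None)
```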

> > ii) some patches may fail to apply with a trivial `quilt push`. This
> > occurs with, at least, a historical publish of samba.
> 
> Have you considered obtaining the patches-applied tree using
> `dpkg-source -x`?  That applies the patches without using quilt(1), so
> might work around this sort of bug (if it didn't, the package would
> FTBFS).

Well, I would do that, but afaict there is no way to tell dpkg-source
to apply only one patch at a time?  The goal is to go from a
patches-unapplied state to the fully patches-applied state, committing
each patch application into the repository.  Perhaps I missed a flag in
the manpage, though?
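
However the per-patch stepping ends up being done, it starts from the
ordered patch list in debian/patches/series.  A minimal sketch of
reading that list follows; the parsing rules (one patch per line, '#'
starts a comment, optional arguments after the patch name) are my
understanding of the quilt series format, and the function name is
hypothetical.

```python
def parse_series(text):
    """Parse the contents of a quilt series file.

    Returns a list of (patch_name, extra_args) tuples in application
    order.  Blank lines and '#' comments are skipped.
    """
    patches = []
    for raw in text.splitlines():
        # Drop trailing comments, then surrounding whitespace.
        line = raw.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        patches.append((fields[0], fields[1:]))
    return patches


# Usage: an importer could apply and commit each entry in turn.
series = "# Debian patches\nwaf-as-source.patch\nfix-build.patch -p1\n"
for name, args in parse_series(series):
    print(name, args)
```

An importer that wants one commit per patch would walk this list in
order and stop (or fall back to the fully-applied tree) at the first
patch that does not apply cleanly.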

Thanks,
Nish


-- 
Nishanth Aravamudan
Ubuntu Server
Canonical Ltd



Re: patches-applied historical imports (usd import)

2017-01-09 Thread Robie Basak
On Mon, Jan 09, 2017 at 05:17:08PM +0000, Ian Jackson wrote:
> Robie Basak writes ("Re: patches-applied historical imports (usd import)"):
> > Right now, we're accepting rich history only for Ubuntu-specific
> > commits. I don't think we have really considered yet what would be best
> > for Debian. But in the future, if Debian is interested in the same
> > mechanism, then the same would apply to Debian's ancestry trees. I
> > don't see any reason why it wouldn't make sense for Debian to do the
> > same thing here.
> 
> When Debian has its own complete and ongoing git history, with a
> mixture of rich and imported histories, you will obviously want to
> fold that into your ongoing Ubuntu history.  How do you plan to do
> that ?

Using the same "adopt rich history" mechanism. Give the importer, in
advance, a map of (srcpkgname, versionstring) -> (commithash). When the
importer imports, if the map lookup succeeds, the corresponding commit
is added as an additional parent.
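
A minimal sketch of that lookup, assuming the map is a plain dict
keyed by (srcpkgname, versionstring); the function and argument names
are hypothetical and not taken from the importer:

```python
def parents_for_import(srcpkg, version, archive_parents, rich_history):
    """Compute the parent list for an imported commit.

    archive_parents: commit hashes derived from archive history.
    rich_history: dict mapping (srcpkg, version) -> commit hash
                  supplied in advance (e.g. via an upload tag).
    """
    parents = list(archive_parents)
    rich = rich_history.get((srcpkg, version))
    # Adopt the rich-history commit as an additional parent, if any.
    if rich is not None and rich not in parents:
        parents.append(rich)
    return parents
```

Because the parent list feeds into the commit hash, the same import
run with and without a map entry produces different commits, which is
the reproducibility caveat discussed elsewhere in this thread.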

Then the question just becomes: whence do we get this map? Right now one
mechanism is that the uploader provides it to the importer repository in
advance via a tag. We could also use the dgit-like mechanism of putting
a commit hash in the changes file together with some known place to find
that commit object.

Further mechanisms might be able to locate a commit object by following
a Vcs-Git header and expecting a dep14 tag there, though I think this
needs some thought about safety and edge cases (repository unavailable,
etc).
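
For locating a dep14 tag, the version string first has to be mangled
into a valid git ref name.  The sketch below follows the substitution
rules as I understand them from DEP-14 (':' becomes '%', '~' becomes
'_', and '#' is inserted after a dot that is followed by another dot,
by the end of the string, or by a trailing "lock").  The function
names and the "debian" vendor prefix are assumptions for illustration.

```python
import re


def dep14_mangle(version):
    """Mangle a Debian version string into a git-ref-safe form."""
    mangled = version.translate(str.maketrans(":~", "%_"))
    # Break up "..", a trailing ".", and a trailing ".lock".
    return re.sub(r"\.(?=\.|$|lock$)", ".#", mangled)


def dep14_tag(vendor, version):
    """Return the tag name to look for, e.g. under a Vcs-Git repository."""
    return f"{vendor}/{dep14_mangle(version)}"


print(dep14_tag("debian", "2:3.6.5-2"))      # debian/2%3.6.5-2
print(dep14_tag("debian", "2.13~rc3-5"))     # debian/2.13_rc3-5
```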

Ultimately either we agree on a known way to find the commit objects, or
we rely on uploaders to provide them, or we do not get the rich history.

If Debian does not provide a mechanism for the importer to pick this up
automatically (or we cannot use it for whatever reason), an Ubuntu
uploader could still manually pull in Debian VCS history by preparing a
commit that uses Debian VCS in its history, and providing that as the
"rich history commit object" to the importer before uploading to Ubuntu.

Does this answer your question? I'm not sure what else you're looking
for here.

> > I believe it does currently, unless Nish steps in to correct me. So this
> > part could indeed be the same. However, I think the same rich history
> > case above applies here too. We're not doing it right now, but if a
> > maintainer wants to supply rich upstream commits (for example by
> > connecting upstream's VCS to the commit graph), then I think this is
> > something the importer could support. And in this case, the commit
> > hashes would start to mismatch.
> 
> Indeed.  I don't think this can be made perfect but the more we make
> it similar the better.
> 
> dgit's imports of tarballs are always origin commits, with a separate
> commit to stitch them into history.  That allows for a different
> import of the same tarball to have different parents, but still share
> the same tarball origin commit.

Ah. We could do the same then, to use the same tarball origin commit
hash. That might be useful.
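
To see why an origin commit can be shared: a git commit id is just a
hash over the tree id, the parent list, the author/committer lines and
the message.  With no parents and fixed metadata, the same tarball
tree always yields the same id.  The sketch below computes ids the way
git does for a commit object; the identity string, timestamp and
messages are made-up placeholders, not what dgit actually records.

```python
import hashlib

# Id of git's empty tree, used here only as a stand-in tree.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"


def commit_id(tree, parents, ident, timestamp, message):
    """Compute the SHA-1 id git would assign to this commit object."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines.append(f"author {ident} {timestamp} +0000")
    lines.append(f"committer {ident} {timestamp} +0000")
    body = ("\n".join(lines) + "\n\n" + message).encode()
    header = b"commit %d\x00" % len(body)
    return hashlib.sha1(header + body).hexdigest()


ident = "Importer <importer@example.org>"  # hypothetical identity
origin = commit_id(EMPTY_TREE, [], ident, 0, "Import orig tarball\n")
# Two importers stitch the same origin into different histories:
stitch_a = commit_id(EMPTY_TREE, [origin, "a" * 40], ident, 0, "Stitch\n")
stitch_b = commit_id(EMPTY_TREE, [origin, "b" * 40], ident, 0, "Stitch\n")
print(origin, stitch_a != stitch_b)
```

The stitch commits differ because their parents differ, but both still
point at the identical origin commit.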

Robie



Re: patches-applied historical imports (usd import)

2017-01-09 Thread Robie Basak
On Mon, Jan 09, 2017 at 02:02:34PM +0000, Ian Jackson wrote:
> Robie Basak writes ("Re: patches-applied historical imports (usd import)"):
> > On Mon, Jan 09, 2017 at 01:45:52PM +0000, Ian Jackson wrote:
> > > Ideally you and I could agree on a commit structure for the imported
> > > packages, and metadata processing rules, so that the commits are
> > > identical (not just the trees).
> > 
> > By "commits are identical", are you including parent commit hashes? Or
> > just everything else?
> 
> It's true that not all of the parent hashes could be identical.

Right. In particular, our importer has two sources of input for the
unapplied branches: archive history, and rich history supplied by
developers. Commits generated by the importer end up having parents from
both of these sources (subject to source availability).

The rich history input source breaks the "reproducible commit hash"
invariant somewhat. If identical rich history is available on re-import
then the importer should produce the same result commit hash, but the
timing of supply of the rich history affects things. For example, if an
uploader forgets to give us some rich history but supplies it months
later, we still want to keep it but it will not have been adopted into
the graph of commits by the importer. But the importer would take it on
a re-import, thus mutating all subsequent commit hashes.

Right now, we're accepting rich history only for Ubuntu-specific
commits. I don't think we have really considered yet what would be best
for Debian. But in the future, if Debian is interested in the same
mechanism, then the same would apply to Debian's ancestry trees. I
don't see any reason why it wouldn't make sense for Debian to do the
same thing here.

The applied branches (created on your request) have the unapplied branch
as a parent (sort of like git-dpm does), so the same
non-reproducibility filters through.

> dgit imports orig tarballs in a way that is supposed to produce a
> stable commit hash for each tarball, so that different revisions of
> the same upstream version have the upstream as a common ancestor.
> 
> Does Nish's importer try to generate a fast-forwarding upstream
> branch ?  If not then part of the imported commit structure could be
> the same.

I believe it does currently, unless Nish steps in to correct me. So this
part could indeed be the same. However, I think the same rich history
case above applies here too. We're not doing it right now, but if a
maintainer wants to supply rich upstream commits (for example by
connecting upstream's VCS to the commit graph), then I think this is
something the importer could support. And in this case, the commit
hashes would start to mismatch.

Robie



Re: patches-applied historical imports (usd import)

2017-01-09 Thread Ian Jackson
Nish Aravamudan writes ("patches-applied historical imports (usd import)"):
> [stuff]

It occurs to me that it would be a good idea for you to check that
your importer produces identical answers to dgit.  If nothing else,
from your point of view dgit can help you test that your importer
DTRT.

At the very least, for every .dsc which is successfully imported by
both dgit and your tool, the resulting imported tree object should be
identical.

Since I think dgit should be able to import every .dsc in the whole of
history (given new enough underlying tools), I would encourage you to
report every import failure as a bug against dgit.  When considering
such bugs:
 * I do not mind if you do not repro the bug on Debian
 * Point me at the exact .dsc
 * Run dgit with -D and include the output in the bug report
 * Use Severity: important

Ideally you and I could agree on a commit structure for the imported
packages, and metadata processing rules, so that the commits are
identical (not just the trees).

Ian.

-- 
Ian Jackson    These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: patches-applied historical imports (usd import)

2017-01-09 Thread Ian Jackson
Nish Aravamudan writes ("patches-applied historical imports (usd import)"):
> 1) Some source packages (bouncycastle, php7.0 are the ones I can think
> of off the top of my head) upstream tarballs contain .gitattributes
> files, which will change the behavior of git itself when checking out a
> branch.

I need to do some tests to see exactly how to work around this.
Can you point me at an affected source package version ?

> 2) How do we determine if a source package is 1.0 vs. 3.0? I am
> currently using `dpkg-source --print-format`, but have found one source
> package (util-linux 2.13~rc3-5), where dpkg-source emits:
> 
> syntax error in /tmp/tmp3y515osf/util-linux-2.13~rc3/debian/control at
> line 14: duplicate field Depends found
> 
> and thus we error out.

How exciting.  dgit looks at debian/source/format, which is what
dpkg-source reads when generating the package.

> 3) Imagine the following graph in the git repository:
> 
>    A           D
>  ->o.....f....>o->
>    ^ .       . ^
>    |  c     e  |
>    o   .   .   o
>    ^    . .    ^
>    |     B     |
>    o     o     o
>    ^     ^     ^
>    |     |     |
>  ->o---->o---->o->
>    a     b     d
...
> Let's assert that there is a problem with obtaining the patches-applied
> version of b. This can occur for (at least) the following reasons:
> 
> i) as in 2), we might not be able to determine the source package format
> (implementation detail, to some degree), so are unable to correctly
> derive if there is a patches-applied state that is distinct from b.

You can use dpkg-source -x to extract the patches-applied state
corresponding to any .dsc, I think.  Any case where you can't is a bug
in dpkg-source.

> ii) some patches may fail to apply with a trivial `quilt push`. This
> occurs with, at least, a historical publish of samba.

Do you have an example source package ?

> In theory, there are other reasons/cases where this might happen and the
> importer needs to never fail (so it is of some use to run automatically
> :)

I haven't run dgit's dsc importer on a whole historical archive but
Peter Green of Raspbian has been running it and filing bugs.  I haven't
seen such a bug recently so I hope it has been working for all the
packages he's seen.

> The questions I have relate to what to do when we encounter this
> situation, which in turn is divided into two parts:

I think this situation must be excluded.

This kind of difficulty is one of the main reasons for the failure of
previous efforts to transition universally to git.  It is why dgit
does not currently work, in Debian, with a full history of each
package.  Instead there's a bit of a bodge, with a locally generated
history.  Joey Hess and I came up with this approach at the Vaumarcus
Debconf.

Eventually I intend to import the whole history but not yet.

If you intend to go forward without fully solving this problem, your
choices are not good.

> Let's presume that this failure is not persistent and that D is able to
> be imported successfully, we again have to make decisions about
> parenting. I think it only makes sense for one of c or f to exist, based
> upon what we decide is the right policy above.

This is particularly troublesome.  When you can generate the missing
package, you will not be able to make it an ancestor of your existing
history.

You could rewrite the existing history and then pseudomerge the old
and new histories together, but that means duplicating most of the
history metadata.


If you wait with making an official import until it is suitable for
use with dgit, then I might consider using the relevant parts of your
import's history as the official Debian dgit history.

That would save you from the problem which I foresee in our future,
where Debian and Ubuntu have disjoint imports of the same history.


Ian.

-- 
Ian Jackson    These opinions are my own.

If I emailed you from an address @fyvzl.net or @evade.org.uk, that is
a private address which bypasses my fierce spamfilter.



Re: patches-applied historical imports (usd import)

2017-01-08 Thread Sean Whitton
Hello Nish,

On Fri, Jan 06, 2017 at 02:26:29PM -0800, Nish Aravamudan wrote:
> 2) How do we determine if a source package is 1.0 vs. 3.0? I am
> currently using `dpkg-source --print-format`, but have found one source
> package (util-linux 2.13~rc3-5), where dpkg-source emits:

I would just introspect for the debian/source/format file.  If it has a
line "3.0 (.*)" then it's 3.0, otherwise it's 1.0 (for all packages that
were ACCEPTed to ftp-master).

> ii) some patches may fail to apply with a trivial `quilt push`. This
> occurs with, at least, a historical publish of samba.

Have you considered obtaining the patches-applied tree using
`dpkg-source -x`?  That applies the patches without using quilt(1), so
might work around this sort of bug (if it didn't, the package would
FTBFS).

-- 
Sean Whitton

