Bug#737085: apt: Apt downloads arch all packages from wrong repo/checks wrong checksum

2014-01-31 Thread David Kalnischkies
On Thu, Jan 30, 2014 at 03:42:13PM +0100, Julian Andres Klode wrote:
> On Thu, Jan 30, 2014 at 12:27:21PM +, Wookey wrote:
> > +++ Julian Andres Klode [2014-01-30 08:12 +0100]:
> > > On Thu, Jan 30, 2014 at 03:13:16AM +, Wookey wrote:
> > The problem is that in order to debootstrap you need all the packages in
> > one repo so leaving the arch all packages in ftp.uk.debian.org means you
> > can't debootstrap if you only uploaded the new-arch 'any' packages to
> > the 'bootstrap' repo. It's also important to test that the arch-all
> > build actually works, and not just the arch-any part so doing those
> > builds and testing the results can be good. 
> 
> A work around might be to reorder sources.list entries. The order of
> those entries determines from which source a package is retrieved, I
> believe the first match takes precedence.

The first one parsed decides which size is expected – and usually this is
also the one the package is acquired from, with the notable exception of
not downloading from an unsigned archive if a signed one is available…
So, as this bootstrap archive is signed, is the key installed?


> > It's fine for apt to consider these packages to be functionally
> > equivalent, but it does need to check the correct checksum on download.
> > It seems to me that this can be fixed by either adding size/hash to the
> > hash as you suggest (making them 'different packages'), or just separately
> > ensuring that the checksum for the repo/file that was downloaded is
> > used. Apt knows that there is more than one repo source for this
> > package, but doesn't record that there might be more than one checksum?
> > The fact that it can end up choosing one checksum and another source
> > does seem wrong. Perhaps the code/object structure makes it hard to fix
> > this this way and your fix is the only one that makes sense?
> 
> It seems right to me in this case, because otherwise functional aspects
> like dependencies could differ as well. And if APT uses the dependencies
> from one source and then fetches the package from another source, but that
> one has different dependencies, installing it would produce an error.

This situation can't happen: as you have outlined yourself, Depends
influences the CRC hash, so they would be recognized as different
versions. That said, what could happen at the moment is that a package
differs just by its Multi-Arch field
(barring hash collisions, but how likely is that…).

> > > An alternative would be to change the cache-building algorithms to look
> > > at SHA hashes and/or size and create different version entries in the 
> > > cache
> > > if they are present in both versions, but different. SHA Hashes would 
> > > require
> > > all repositories to use the same best checksum algorithm.
> > 
> > I think just adding size to the hash would be cheap and easy and would
> > largely solve this problem. Adding the hash would cover a few extra
> > cases where the size came out the same too, but if it's difficult I'd be
> > happy to have this mostly-solved, as it's a situation we normally try to
> > avoid anyway.
> 
> Adding the size to the hash is not possible, as dpkg does not store the
> size for installed packages. This would mean that an installed package
> always has a different hash than an available package, causing APT to
> go crazy (it would try to "upgrade" all installed packages...).

We could compare the size of the currently parsed version with the size
of the version we compare it with at the moment, though (as long as the
current one isn't the status file one). See attached demo-patch.
Something like that (but tested, this one isn't) could be introduced
with the next ABI break. It isn't bulletproof either, but a bit better.
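
The idea in the paragraph above can be sketched in shell. This is only an illustration, not the attached patch and not APT code: the helpers stanza_field, version_fingerprint and same_version are hypothetical names, the stanzas are made up, and POSIX cksum (a CRC-32) stands in for the CRC that APT computes. The point is that Size is compared only when both sides have one, so a stanza from the dpkg status file (which has no Size) still matches.

```shell
#!/bin/sh
# Print the value of one field of a control stanza file (empty if absent).
stanza_field() {   # usage: stanza_field FILE FIELD
    awk -v f="$2" 'index($0, f ": ") == 1 { print substr($0, length(f) + 3); exit }' "$1"
}

# Stand-in for the existing hash over the dependency-related fields.
version_fingerprint() {   # usage: version_fingerprint FILE
    awk -F': ' '
        $1 == "Installed-Size" || $1 == "Depends" || $1 == "Pre-Depends" ||
        $1 == "Conflicts" || $1 == "Breaks" || $1 == "Replaces" { print }
    ' "$1" | LC_ALL=C sort | cksum | cut -d' ' -f1
}

# Succeeds (exit status 0) if both stanzas should count as one version.
same_version() {   # usage: same_version FILE_A FILE_B
    [ "$(version_fingerprint "$1")" = "$(version_fingerprint "$2")" ] || return 1
    size_a=$(stanza_field "$1" Size)
    size_b=$(stanza_field "$2" Size)
    # The status file carries no Size, so compare only when both are known.
    if [ -n "$size_a" ] && [ -n "$size_b" ]; then
        [ "$size_a" = "$size_b" ] || return 1
    fi
    return 0
}

# Made-up example data: two archives with different sizes, plus a status entry.
tmp=$(mktemp -d)
printf 'Package: example-pkg\nVersion: 1.0-1\nDepends: libexample1\nSize: 22300\n' > "$tmp/repo_a"
printf 'Package: example-pkg\nVersion: 1.0-1\nDepends: libexample1\nSize: 25000\n' > "$tmp/repo_b"
printf 'Package: example-pkg\nVersion: 1.0-1\nDepends: libexample1\n' > "$tmp/status"

if same_version "$tmp/repo_a" "$tmp/repo_b"; then
    echo "repo_a vs repo_b: same version"
else
    echo "repo_a vs repo_b: different versions"
fi
if same_version "$tmp/repo_a" "$tmp/status"; then
    echo "repo_a vs status: same version"
else
    echo "repo_a vs status: different versions"
fi
rm -rf "$tmp"
```

As in the proposal, this is not bulletproof: two builds that happen to have the same size would still be merged.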

(I wonder if it would make sense to move the comparison entirely into
 such on-demand handling rather than generating a CRC for every version.)


Best regards

David Kalnischkies
diff --git a/apt-pkg/deb/deblistparser.cc b/apt-pkg/deb/deblistparser.cc
index 68d544e..4fe5919 100644
--- a/apt-pkg/deb/deblistparser.cc
+++ b/apt-pkg/deb/deblistparser.cc
@@ -95,44 +95,51 @@ string debListParser::Version()
return Section.FindS("Version");
 }
 	/*}}}*/
-// ListParser::NewVersion - Fill in the version structure		/*{{{*/
-// -
-/* */
-bool debListParser::NewVersion(pkgCache::VerIterator &Ver)
+unsigned char debListParser::ParseMultiArch(bool const showErrors)	/*{{{*/
 {
-   // Parse the section
-   Ver->Section = UniqFindTagWrite("Section");
-
-   // Parse multi-arch
+   unsigned char MA;
string const MultiArch = Section.FindS("Multi-Arch");
if (MultiArch.empty() == true)
-  Ver->MultiArch = pkgCache::Version::None;
+  MA = pkgCache::Version::None;
else if (MultiArch == "same") {
   // Parse multi-arch
   if (ArchitectureAll() == true)
   {
 	 /* Arch all packages can't be Multi-Arch: same */
-	 _error->Warning("Architecture: a


2014-01-30 Thread Wookey
+++ Julian Andres Klode [2014-01-30 15:42 +0100]:
> On Thu, Jan 30, 2014 at 12:27:21PM +, Wookey wrote:
> > +++ Julian Andres Klode [2014-01-30 08:12 +0100]:
> > > On Thu, Jan 30, 2014 at 03:13:16AM +, Wookey wrote:
> > > > Package: apt
> > > > Version: 0.9.15
> > > > Severity: important
> > > > 
> > > > In the sources I have my own bootstrap repository containing a lot of
> > > > (unstable) packages built for arm64, and plain debian unstable and 
> > > > saucy repos
> > > > 
> > > > apt-get install(that is available in all 3 repos)
> > > > results in a size mismatch error. It seems that apt is using the
> > > > checksum from one repo but downloading the package from another.
> > > > 
> > > > The packages used is just an example it seems to be the same for any 
> > > > arch all package
> > > > 
> > > > (debian-arm64)# apt-cache policy x11proto-scrnsaver-dev
> > > > x11proto-scrnsaver-dev:
> > > >   Installed: (none)
> > > >   Candidate: 1.2.2-1
> > > >   Version table:
> > > >  1.2.2-1 0
> > > > 500 http://people.debian.org/~wookey/bootstrap/debianrepo2/ 
> > > > debianstrap/main arm64 Packages
> > > > 500 http://ftp.uk.debian.org/debian/ unstable/main amd64 
> > > > Packages
> > > > 500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 
> > > > Packages
> > > > 
> > > 
> > > Right, and that's a problem, as having two different packages with the
> > > same version is not really supported. 

> > OK. That makes sense. I see what's going on now. 
> > 
> > The problem is that in order to debootstrap you need all the packages in
> > one repo so leaving the arch all packages in ftp.uk.debian.org means you
> > can't debootstrap if you only uploaded the new-arch 'any' packages to
> > the 'bootstrap' repo. It's also important to test that the arch-all
> > build actually works, and not just the arch-any part so doing those
> > builds and testing the results can be good. 
> 
> A work around might be to reorder sources.list entries. The order of
> those entries determines from which source a package is retrieved, I
> believe the first match takes precedence.

Ha. That does indeed provide a working workaround :-)

Moving the repo that the 'all' packages are being downloaded from to the
top of the list makes it work.

so moving ftp.uk.debian.org above p.d.o/~wookey/bootstrap

# apt-cache policy x11proto-scrnsaver-dev
x11proto-scrnsaver-dev:
  Installed: 1.2.2-1
  Candidate: 1.2.2-1
  Version table:
 *** 1.2.2-1 0
        550 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
       1001 http://people.debian.org/~wookey/bootstrap/debianrepo2/ debianstrap/main arm64 Packages
        500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 Packages
        100 /var/lib/dpkg/status

means that both downloads and checksums come from ftp.uk.debian.org
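
For anyone wanting to try the same reordering, here is a small shell sketch. The helper name move_source_first is made up, and the three deb lines are reconstructed from the apt-cache policy output in this thread rather than copied from the real sources.list, so treat them as assumptions. The function only prints the reordered list; it does not modify the file.

```shell
#!/bin/sh
# Print FILE with every line containing URL_SUBSTRING moved to the top,
# keeping the relative order of all other lines.
move_source_first() {   # usage: move_source_first FILE URL_SUBSTRING
    awk -v pat="$2" '
        index($0, pat) { print; next }   # matching entries are printed first
        { rest = rest $0 "\n" }          # everything else is held back
        END { printf "%s", rest }
    ' "$1"
}

# Assumed sources.list content (reconstructed, see above).
list=$(mktemp)
cat > "$list" <<'EOF'
deb http://people.debian.org/~wookey/bootstrap/debianrepo2/ debianstrap main
deb http://ftp.uk.debian.org/debian/ unstable main
deb http://ports.ubuntu.com/ubuntu-ports/ saucy main
EOF

move_source_first "$list" ftp.uk.debian.org
rm -f "$list"
```

After reviewing the printed result, it could be written back to the real file by hand, followed by apt-get update.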

I'll use it like this for a bit and see if it always works, or just
sometimes :-)

This isn't really any sort of actual 'solution', but it's a very handy
suggestion.

Wookey
-- 
Principal hats:  Linaro, Emdebian, Wookware, Balloonboard, ARM
http://wookware.org/






2014-01-30 Thread Julian Andres Klode
On Thu, Jan 30, 2014 at 12:27:21PM +, Wookey wrote:
> +++ Julian Andres Klode [2014-01-30 08:12 +0100]:
> > On Thu, Jan 30, 2014 at 03:13:16AM +, Wookey wrote:
> > > Package: apt
> > > Version: 0.9.15
> > > Severity: important
> > > 
> > > In the sources I have my own bootstrap repository containing a lot of
> > > (unstable) packages built for arm64, and plain debian unstable and saucy 
> > > repos
> > > 
> > > apt-get install(that is available in all 3 repos)
> > > results in a size mismatch error. It seems that apt is using the
> > > checksum from one repo but downloading the package from another.
> > > 
> > > The packages used is just an example it seems to be the same for any arch 
> > > all package
> > > 
> > > (debian-arm64)# apt-cache policy x11proto-scrnsaver-dev
> > > x11proto-scrnsaver-dev:
> > >   Installed: (none)
> > >   Candidate: 1.2.2-1
> > >   Version table:
> > >  1.2.2-1 0
> > > 500 http://people.debian.org/~wookey/bootstrap/debianrepo2/ 
> > > debianstrap/main arm64 Packages
> > > 500 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
> > > 500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 
> > > Packages
> > > 
> > 
> > Right, and that's a problem, as having two different packages with the
> > same version is not really supported. APT differentiates packages with the
> > same version by CRC-16 hashing the fields
> > Installed-Size
> > Depends
> > Pre-Depends
> > Conflicts
> > Breaks
> > Replaces
> > in order to handle packages where those are the same APT would need to hash
> > size or SHA hash as well, but this fails for installed packages, as this
> > information is not provided in /var/lib/dpkg/status.
> 
> OK. That makes sense. I see what's going on now. 
> 
> Which of course is why we do -B builds for other architectures and
> carefully ensure there is only one copy of the arch all packages.
> 
> 
> The problem is that in order to debootstrap you need all the packages in
> one repo so leaving the arch all packages in ftp.uk.debian.org means you
> can't debootstrap if you only uploaded the new-arch 'any' packages to
> the 'bootstrap' repo. It's also important to test that the arch-all
> build actually works, and not just the arch-any part so doing those
> builds and testing the results can be good. 

A work around might be to reorder sources.list entries. The order of
those entries determines from which source a package is retrieved, I
believe the first match takes precedence.

> 
> It's fine for apt to consider these packages to be functionally
> equivalent, but it does need to check the correct checksum on download.
> It seems to me that this can be fixed by either adding size/hash to the
> hash as you suggest (making them 'different packages'), or just separately
> ensuring that the checksum for the repo/file that was downloaded is
> used. Apt knows that there is more than one repo source for this
> package, but doesn't record that there might be more than one checksum?
> The fact that it can end up choosing one checksum and another source
> does seem wrong. Perhaps the code/object structure makes it hard to fix
> this this way and your fix is the only one that makes sense?

It seems right to me in this case, because otherwise functional aspects
like dependencies could differ as well. And if APT uses the dependencies
from one source and then fetches the package from another source, but that
one has different dependencies, installing it would produce an error.

> 
> > An alternative would be to change the cache-building algorithms to look
> > at SHA hashes and/or size and create different version entries in the cache
> > if they are present in both versions, but different. SHA Hashes would 
> > require
> > all repositories to use the same best checksum algorithm.
> 
> I think just adding size to the hash would be cheap and easy and would
> largely solve this problem. Adding the hash would cover a few extra
> cases where the size came out the same too, but if it's difficult I'd be
> happy to have this mostly-solved, as it's a situation we normally try to
> avoid anyway.

Adding the size to the hash is not possible, as dpkg does not store the
size for installed packages. This would mean that an installed package
always has a different hash than an available package, causing APT to
go crazy (it would try to "upgrade" all installed packages...).

David or Michael probably have some more ideas.

-- 
Julian Andres Klode  - Debian Developer, Ubuntu Member

See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.

Please do not top-post if possible.






2014-01-30 Thread Wookey
+++ Julian Andres Klode [2014-01-30 08:12 +0100]:
> On Thu, Jan 30, 2014 at 03:13:16AM +, Wookey wrote:
> > Package: apt
> > Version: 0.9.15
> > Severity: important
> > 
> > In the sources I have my own bootstrap repository containing a lot of
> > (unstable) packages built for arm64, and plain debian unstable and saucy 
> > repos
> > 
> > apt-get install(that is available in all 3 repos)
> > results in a size mismatch error. It seems that apt is using the
> > checksum from one repo but downloading the package from another.
> > 
> > The packages used is just an example it seems to be the same for any arch 
> > all package
> > 
> > (debian-arm64)# apt-cache policy x11proto-scrnsaver-dev
> > x11proto-scrnsaver-dev:
> >   Installed: (none)
> >   Candidate: 1.2.2-1
> >   Version table:
> >  1.2.2-1 0
> > 500 http://people.debian.org/~wookey/bootstrap/debianrepo2/ 
> > debianstrap/main arm64 Packages
> > 500 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
> > 500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 Packages
> > 
> 
> Right, and that's a problem, as having two different packages with the
> same version is not really supported. APT differentiates packages with the
> same version by CRC-16 hashing the fields
>   Installed-Size
>   Depends
>   Pre-Depends
>   Conflicts
>   Breaks
>   Replaces
> in order to handle packages where those are the same APT would need to hash
> size or SHA hash as well, but this fails for installed packages, as this
> information is not provided in /var/lib/dpkg/status.

OK. That makes sense. I see what's going on now. 

Which of course is why we do -B builds for other architectures and
carefully ensure there is only one copy of the arch all packages.


The problem is that in order to debootstrap you need all the packages in
one repo so leaving the arch all packages in ftp.uk.debian.org means you
can't debootstrap if you only uploaded the new-arch 'any' packages to
the 'bootstrap' repo. It's also important to test that the arch-all
build actually works, and not just the arch-any part so doing those
builds and testing the results can be good. 

It's fine for apt to consider these packages to be functionally
equivalent, but it does need to check the correct checksum on download.
It seems to me that this can be fixed by either adding size/hash to the
hash as you suggest (making them 'different packages'), or just separately
ensuring that the checksum for the repo/file that was downloaded is
used. Apt knows that there is more than one repo source for this
package, but doesn't record that there might be more than one checksum?
The fact that it can end up choosing one checksum and another source
does seem wrong. Perhaps the code/object structure makes it hard to fix
this this way and your fix is the only one that makes sense?

> An alternative would be to change the cache-building algorithms to look
> at SHA hashes and/or size and create different version entries in the cache
> if they are present in both versions, but different. SHA Hashes would require
> all repositories to use the same best checksum algorithm.

I think just adding size to the hash would be cheap and easy and would
largely solve this problem. Adding the hash would cover a few extra
cases where the size came out the same too, but if it's difficult I'd be
happy to have this mostly-solved, as it's a situation we normally try to
avoid anyway.

I am clueless about the apt codebase (and C++ if it's not fairly 'C'-ey)
but am prepared to take a stab at this if you give me a clue where to
look.

thanks for the quick response.

Wookey
-- 
Principal hats:  Linaro, Emdebian, Wookware, Balloonboard, ARM
http://wookware.org/






2014-01-29 Thread Julian Andres Klode
On Thu, Jan 30, 2014 at 03:13:16AM +, Wookey wrote:
> Package: apt
> Version: 0.9.15
> Severity: important
> 
> In the sources I have my own bootstrap repository containing a lot of
> (unstable) packages built for arm64, and plain debian unstable and saucy repos
> 
> apt-get install(that is available in all 3 repos)
> results in a size mismatch error. It seems that apt is using the
> checksum from one repo but downloading the package from another.
> 
> The packages used is just an example it seems to be the same for any arch all 
> package
> 
> (debian-arm64)# apt-cache policy x11proto-scrnsaver-dev
> x11proto-scrnsaver-dev:
>   Installed: (none)
>   Candidate: 1.2.2-1
>   Version table:
>  1.2.2-1 0
> 500 http://people.debian.org/~wookey/bootstrap/debianrepo2/ 
> debianstrap/main arm64 Packages
> 500 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
> 500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 Packages
> 

Right, and that's a problem, as having two different packages with the
same version is not really supported. APT differentiates packages with the
same version by CRC-16 hashing the fields
  Installed-Size
  Depends
  Pre-Depends
  Conflicts
  Breaks
  Replaces
In order to handle packages where those are the same, APT would need to hash
the size or SHA hash as well, but this fails for installed packages, as this
information is not provided in /var/lib/dpkg/status.
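
To illustrate why the two archives' packages end up as a single version, here is a rough shell sketch of the idea. It is not APT's actual code: version_fingerprint is a hypothetical helper, the stanzas are made up, and POSIX cksum (a CRC-32) is used as a stand-in for APT's CRC-16. Only the six fields listed above feed the checksum, so stanzas that differ only in Size or MD5sum get the same fingerprint.

```shell
#!/bin/sh
# Hypothetical helper: fingerprint a control stanza read from stdin using
# only the six fields listed above. Continuation lines are ignored for brevity.
version_fingerprint() {
    awk -F': ' '
        $1 == "Installed-Size" || $1 == "Depends" || $1 == "Pre-Depends" ||
        $1 == "Conflicts" || $1 == "Breaks" || $1 == "Replaces" { print }
    ' | LC_ALL=C sort | cksum | cut -d' ' -f1
}

# Two made-up stanzas for the same version: identical dependency fields,
# different Size and MD5sum (values are illustrative only).
stanza_a='Package: example-pkg
Version: 1.0-1
Installed-Size: 106
Depends: libexample1
Size: 22300
MD5sum: 00000000000000000000000000000000'

stanza_b='Package: example-pkg
Version: 1.0-1
Installed-Size: 106
Depends: libexample1
Size: 25000
MD5sum: ffffffffffffffffffffffffffffffff'

fp_a=$(printf '%s\n' "$stanza_a" | version_fingerprint)
fp_b=$(printf '%s\n' "$stanza_b" | version_fingerprint)

if [ "$fp_a" = "$fp_b" ]; then
    echo "same fingerprint: merged into one version entry"
else
    echo "different fingerprints: kept as separate versions"
fi
```

Because Size and the file hashes are not part of the fingerprint, a stanza from /var/lib/dpkg/status, which lacks them, can still match the archive's stanza for the same version.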

An alternative would be to change the cache-building algorithms to look
at SHA hashes and/or size and create different version entries in the cache
if they are present in both versions, but different. SHA Hashes would require
all repositories to use the same best checksum algorithm.

-- 
Julian Andres Klode  - Debian Developer, Ubuntu Member

See http://wiki.debian.org/JulianAndresKlode and http://jak-linux.org/.

Please do not top-post if possible.






2014-01-29 Thread Wookey
Package: apt
Version: 0.9.15
Severity: important

In the sources I have my own bootstrap repository containing a lot of
(unstable) packages built for arm64, plus plain Debian unstable and saucy repos.

Running apt-get install on a package that is available in all 3 repos
results in a size mismatch error. It seems that apt is using the
checksum from one repo but downloading the package from another.

The package used is just an example; it seems to be the same for any
arch-all package.

(debian-arm64)# apt-cache policy x11proto-scrnsaver-dev
x11proto-scrnsaver-dev:
  Installed: (none)
  Candidate: 1.2.2-1
  Version table:
     1.2.2-1 0
        500 http://people.debian.org/~wookey/bootstrap/debianrepo2/ debianstrap/main arm64 Packages
        500 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
        500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 Packages

# apt-get install x11proto-scrnsaver-dev
Reading package lists... Done
Building dependency tree   
Reading state information... Done
The following NEW packages will be installed:
  x11proto-scrnsaver-dev
0 upgraded, 1 newly installed, 0 to remove and 118 not upgraded.
Need to get 22.3 kB of archives.
After this operation, 106 kB of additional disk space will be used.
Get:1 http://ftp.uk.debian.org/debian/ unstable/main x11proto-scrnsaver-dev all 1.2.2-1 [22.3 kB]
Fetched 25.0 kB in 0s (1526 kB/s) 
E: Failed to fetch http://ftp.uk.debian.org/debian/pool/main/x/x11proto-scrnsaver/x11proto-scrnsaver-dev_1.2.2-1_all.deb  Size mismatch

wget http://ftp.uk.debian.org/debian/pool/main/x/x11proto-scrnsaver/x11proto-scrnsaver-dev_1.2.2-1_all.deb
wget http://people.debian.org/~wookey/bootstrap/debianrepo2/pool/main/x/x11proto-scrnsaver/x11proto-scrnsaver-dev_1.2.2-1_all.deb
This is the one from ftp.uk.debian.org:
(debian-arm64)# md5sum x11proto-scrnsaver-dev_1.2.2-1_all.deb
fc8b3d0bc4c7e7aefa0177d94382adc4  x11proto-scrnsaver-dev_1.2.2-1_all.deb
This is the one from people.debian.org:
(debian-arm64)# md5sum x11proto-scrnsaver-dev_1.2.2-1_all.deb.1 
842270da2db205f3819a4dbaf4a75658  x11proto-scrnsaver-dev_1.2.2-1_all.deb.1

Looking in the Packages files, those numbers are correct:
/var/lib/apt/lists/ftp.uk.debian.org_debian_dists_unstable_main_binary-amd64_Packages
MD5sum: fc8b3d0bc4c7e7aefa0177d94382adc4
SHA1: 5660bef42accd401efc3a04056330a9e34cbaf2d
SHA256: 505bb5098c80355c4474df5c8b3677fe1fda74764a52a29f7afca8e3df0603ad

/var/lib/apt/lists/people.debian.org_%7ewookey_bootstrap_debianrepo2_dists_debianstrap_main_binary-arm64_Packages
SHA256: e00c64cd6cab5e0eef91fb18440ec78827aeeb6452f79f450fb37acaa16f7984
SHA1: 83177ab07be653b427cb3d0d94a05f47f4a49a87
MD5sum: 842270da2db205f3819a4dbaf4a75658
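
The manual comparison above can also be scripted. The following is a minimal sketch under stated assumptions: check_deb is a made-up helper, it relies on GNU md5sum, and the expected Size and MD5sum are assumed to be copied by hand from the stanza of whichever Packages file is being tested. Since the real .deb files cannot be bundled here, the demonstration uses a small stand-in file.

```shell
#!/bin/sh
# Compare a downloaded file against the Size and MD5sum taken from one
# archive's Packages stanza.
check_deb() {   # usage: check_deb FILE EXPECTED_SIZE EXPECTED_MD5
    actual_size=$(wc -c < "$1" | tr -d ' ')
    if [ "$actual_size" != "$2" ]; then
        echo "size mismatch: expected $2, got $actual_size"
        return 1
    fi
    actual_md5=$(md5sum "$1" | cut -d' ' -f1)
    if [ "$actual_md5" != "$3" ]; then
        echo "hash mismatch: expected $3, got $actual_md5"
        return 1
    fi
    echo "ok"
}

# Stand-in file instead of a real .deb.
deb=$(mktemp)
printf 'stand-in for a .deb\n' > "$deb"
size=$(wc -c < "$deb" | tr -d ' ')
md5=$(md5sum "$deb" | cut -d' ' -f1)

# Metadata and file from the same archive: passes.
check_deb "$deb" "$size" "$md5"
# Metadata from one archive, file from another: fails on the size check.
check_deb "$deb" "$((size + 1))" "$md5" || echo "(this is the failure apt reports)"
rm -f "$deb"
```

Running it once per archive, with each archive's own Size and MD5sum, shows whether every file is consistent with its own Packages entry.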

So there is no reason why it should be saying 'size mismatch'.
A clue may be that if we set some pinning the 'wrong' .deb gets downloaded:

# apt-cache policy x11proto-scrnsaver-dev
x11proto-scrnsaver-dev:
  Installed: (none)
  Candidate: 1.2.2-1
  Version table:
     1.2.2-1 0
       1001 http://people.debian.org/~wookey/bootstrap/debianrepo2/ debianstrap/main arm64 Packages
        550 http://ftp.uk.debian.org/debian/ unstable/main amd64 Packages
        500 http://ports.ubuntu.com/ubuntu-ports/ saucy/main arm64 Packages

# apt-get install x11proto-scrnsaver-dev
Reading package lists... Done
Building dependency tree   
Reading state information... Done
The following NEW packages will be installed:
  x11proto-scrnsaver-dev
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 22.3 kB of archives.
After this operation, 106 kB of additional disk space will be used.
Get:1 http://ftp.uk.debian.org/debian/ unstable/main x11proto-scrnsaver-dev all 1.2.2-1 [22.3 kB]
Fetched 25.0 kB in 0s (0 B/s)
E: Failed to fetch http://ftp.uk.debian.org/debian/pool/main/x/x11proto-scrnsaver/x11proto-scrnsaver-dev_1.2.2-1_all.deb  Size mismatch

Should it not choose the repo with the highest pinning?
Is it getting the MD5SUM from one source but the binary from another?

If I remove 2 of the sources so that only one is available, then
x11proto-scrnsaver-dev is downloaded and installed OK.

# apt-get install x11proto-scrnsaver-dev/debianstrap
still downloads the one from http://ftp.uk.debian.org/debian/ and still gets
the size mismatch.
Specifying a codename just affects the version selection, not where it
is downloaded from (which would be fine if it checked the right checksum :)

# apt-get install  x11proto-scrnsaver-dev/debianstrap 
Reading package lists... Done
Building dependency tree   
Reading state information... Done
Selected version '1.2.2-1' (Multiarch native-bootstrap packages:people.debian.org, Debian:unstable [all]) for 'x11proto-scrnsaver-dev'
The following NEW packages will be installed:
  x11proto-scrnsaver-dev
0 upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 22.3 kB of archives.
After this operation, 106 kB of additional disk space will be used.
Get:1 http://ftp.uk.debian.org/debian/ unstable/main x11proto-