Bug#582352: cupt: downloading Packages.bz2 failed: invalid size

2010-05-19 Thread Jonathan Nieder
Package: cupt
Version: 1.5.11

Every once in a while (mirror sync-related?), cupt update is failing:

 # cupt update
 [...]
 Get:6 http://ftp.us.debian.org/debian experimental Release.gpg
 Get:7 http://ftp.us.debian.org/debian experimental/main Packages.bz2 [257KiB]
 54% [7 experimental/main Packages.bz2 0B/257KiB 0%]| 75.7KiB/s | ETA: 4s
 W: downloading http://ftp.us.debian.org/debian/dists/experimental/main/binary-i386/Packages.bz2 failed: invalid size: expected '262710', got '260345'
 [...]
 Fetched 10.7MiB in 43s.
 
 #
 # cupt update
 [...]
 Get:6 http://ftp.us.debian.org/debian sid Release.gpg
 100% [6 sid Release.gpg 0B/835B 0%]| 77.9KiB/s | ETA: 0s
 W: gpg: '/var/lib/apt/lists/ftp.us.debian.org_debian_dists_sid_Release': bad signature: 9AA38DCD55BE302B Debian Archive Automatic Signing Key (5.0/lenny)
 W: signature verification for 'sid Release' failed
 Get:7 http://ftp.us.debian.org/debian sid/main Packages.bz2 [6495KiB]
 Get:8 http://ftp.us.debian.org/debian sid/main Translation-en_US.bz2
 100% [8 sid/main Translation-en_US.bz2 0B]  | 0B/s | ETA: 0s
 W: downloading http://ftp.us.debian.org/debian/dists/sid/main/i18n/Translation-en_US.bz2 failed: HTTP response code said error: 404
 [...]
 Fetched 6798KiB in 28s.
 
 # date -u
 Thu May 20 06:05:55 UTC 2010

I am not sure what is actually behind this but thought I should get
your advice.  Is this a known problem?  Could cupt help diagnose it
more easily?

Workaround: use apt-get update to get the pdiffs.

Jonathan



-- 
To UNSUBSCRIBE, email to debian-bugs-dist-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org



Bug#582352: cupt: downloading Packages.bz2 failed: invalid size

2010-05-20 Thread Eugene V. Lyubimkin
Jonathan Nieder wrote:
> Package: cupt
> Version: 1.5.11
> 
> Every once in a while (mirror sync-related?), cupt update is failing:
> 
>  # cupt update
>  [...]
>  Get:6 http://ftp.us.debian.org/debian experimental Release.gpg
>  Get:7 http://ftp.us.debian.org/debian experimental/main Packages.bz2 [257KiB]
>  54% [7 experimental/main Packages.bz2 0B/257KiB 0%]| 75.7KiB/s | ETA: 4s
>  W: downloading http://ftp.us.debian.org/debian/dists/experimental/main/binary-i386/Packages.bz2 failed: invalid size: expected '262710', got '260345'
>  [...]
>  Fetched 10.7MiB in 43s.
>  #
Firstly, I would like to confirm that even in that case, the whole update
thing is run fully, because the next lines should indicate downloading
Packages.gz for the same entry and succeed (or, hm, possibly, fail too).
Is it the case?

> I am not sure what is actually behind this but thought I should get
> your advice.  Is this a known problem?
No. This is the first time I have seen this error pop up. My first guess is
that the mirror is misbehaving. Personally, I haven't used ftp.us.debian.org
for at least a year and some months; I do, however, use several other mirrors
without errors like this.

> Could cupt help diagnose it
> more easily?
Well, the problem is clear: the size specified in the Release file didn't match
the actual size of Packages.{ext}. This may also indicate a security problem, so
Cupt won't accept any download whose size differs from the expected one.
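
The check described above can be sketched roughly as follows. This is only an illustration and not Cupt's actual code: the names `parse_release_sizes` and `check_size` are hypothetical, and the sample Release fragment uses a placeholder hash together with the sizes from the log quoted earlier.

```python
def parse_release_sizes(release_text):
    """Map each index path to the size recorded in a Release file.

    Checksum sections (MD5Sum:, SHA1:, SHA256:) hold indented lines of
    the form " <hash> <size> <path>".
    """
    sizes = {}
    in_checksums = False
    for line in release_text.splitlines():
        if not line.startswith(" "):
            # An unindented line starts a new field.
            in_checksums = line.rstrip() in ("MD5Sum:", "SHA1:", "SHA256:")
            continue
        if in_checksums:
            parts = line.split()
            if len(parts) == 3:
                _digest, size, path = parts
                sizes[path] = int(size)
    return sizes


def check_size(sizes, path, downloaded):
    """Return None if the size matches, else a warning string."""
    expected = sizes[path]
    actual = len(downloaded)
    if actual != expected:
        return "invalid size: expected '%d', got '%d'" % (expected, actual)
    return None


# Hypothetical Release fragment; the hash is a placeholder, not real data.
SAMPLE_RELEASE = """\
Suite: experimental
MD5Sum:
 0123456789abcdef0123456789abcdef 262710 main/binary-i386/Packages.bz2
"""
```

A Release file fetched from one mirror and a Packages.bz2 fetched from another, slightly older or newer one would fail this comparison even though neither file is corrupt.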

Some time ago I asked the FTP team about a spec for detecting that a mirror
update is in progress by using a file which they apparently place temporarily
somewhere, but my mail didn't get any answer, and since then I have not
encountered this problem and forgot about it.

So. The first thing I propose is to verify that the problem is
mirror-dependent, for example by trying another, non-US mirror for some
time. This would not necessarily mean that the problem is not in Cupt, but it
would help to "bisect" what's going on.

-- 
Eugene V. Lyubimkin aka JackYF, JID: jackyf.devel(maildog)gmail.com
C++/Perl developer, Debian Developer





Bug#582352: cupt: downloading Packages.bz2 failed: invalid size

2010-05-20 Thread Jonathan Nieder
Eugene V. Lyubimkin wrote:

> Firstly, I would like to confirm that even in that case, the whole update
> thing is run fully, because the next lines should indicate downloading
> Packages.gz for the same entry and succeed (or, hm, possibly, fail too).
> Is it the case?

Oh!  Yes, that’s right; it succeeds.

Get:1 http://ftp.us.debian.org/debian sid Release
Get:2 http://ftp.us.debian.org/debian experimental Release
Get:3 http://ftp.egr.msu.edu/debian sid Release
Get:4 http://ftp.us.debian.org/debian experimental Release.gpg
Get:5 http://ftp.us.debian.org/debian sid Release.gpg
100% [4 experimental Release.gpg 835B/835B 100%][5 sid Release.| 76.5KiB/s | ETA: 0s
W: gpg: '/var/lib/apt/lists/ftp.us.debian.org_debian_dists_experimental_Release': bad signature: 9AA38DCD55BE302B Debian Archive Automatic Signing Key (5.0/lenny)
W: signature verification for 'experimental Release' failed
Get:6 http://ftp.egr.msu.edu/debian sid Release.gpg
Get:7 http://ftp.us.debian.org/debian experimental/main Packages.bz2 [254KiB]
54% [7 experimental/main Packages.bz2 0B/254KiB 0%]| 77.9KiB/s | ETA: 2s
W: downloading http://ftp.us.debian.org/debian/dists/experimental/main/binary-i386/Packages.bz2 failed: invalid size: expected '260345', got '260479'
Get:8 http://ftp.us.debian.org/debian experimental/main Packages.gz [314KiB]
[...]

Trying ‘cupt update’ again:

Get:1 http://ftp.us.debian.org/debian sid Release
Get:2 http://ftp.us.debian.org/debian experimental Release
Get:3 http://ftp.egr.msu.edu/debian sid Release
Get:4 http://ftp.us.debian.org/debian sid Release.gpg
76% [3 sid Release 28.0KiB/101KiB 28%][4 sid Release.gpg 835B/8| 57.0KiB/s | ETA: 0s
W: gpg: '/var/lib/apt/lists/ftp.us.debian.org_debian_dists_sid_Release': bad signature: 9AA38DCD55BE302B Debian Archive Automatic Signing Key (5.0/lenny)
W: signature verification for 'sid Release' failed
Get:5 http://ftp.us.debian.org/debian experimental Release.gpg
Get:6 http://ftp.egr.msu.edu/debian sid Release.gpg
100% [5 experimental Release.gpg 835B/835B 100%]   | 77.5KiB/s | ETA: 0s
W: gpg: '/var/lib/apt/lists/ftp.us.debian.org_debian_dists_experimental_Release': bad signature: 9AA38DCD55BE302B Debian Archive Automatic Signing Key (5.0/lenny)
W: signature verification for 'experimental Release' failed
Get:7 http://ftp.us.debian.org/debian sid/main Packages.bz2 [6495KiB]
4% [7 sid/main Packages.bz2 0B/6495KiB 0%]  | 25.0KiB/s | ETA: 1m47s
W: downloading http://ftp.us.debian.org/debian/dists/sid/main/binary-i386/Packages.bz2 failed: invalid size: expected '6651138', got '6650451'
Get:8 http://ftp.us.debian.org/debian sid/main Packages.gz [8542KiB]
3% [8 sid/main Packages.gz 0B/8542KiB 0%]  | 403B/s | ETA: 2m49s
W: downloading http://ftp.us.debian.org/debian/dists/sid/main/binary-i386/Packages.gz failed: invalid size: expected '8747079', got '8747611'
Get:9 http://ftp.us.debian.org/debian sid/main Packages [30.4MiB]
1% [9 sid/main Packages 0B/30.4MiB 0%]| 452B/s | ETA: 10m16s
W: failed to download index for 'sid/main'
W: downloading http://ftp.us.debian.org/debian/dists/sid/main/binary-i386/Packages failed: HTTP response code said error: 404

So it looks like the Release is temporarily “out of sync” with other files.

Meanwhile, I have never run into this problem with apt-get.

... ah, okay, maybe this is it: ftp.us.debian.org uses round-robin DNS
to switch between multiple mirrors.  APT’s HTTP method copes with this
by doing the lookup once and reusing the IP for a number of requests,
whereas it looks like cupt is switching between mirrors too often.
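
The "look up once, reuse the IP" behaviour can be sketched as below. This is a minimal illustration under stated assumptions, not APT's actual HTTP method: the class name `PinnedResolver` and its injectable `lookup` parameter are hypothetical, and only the standard `socket.getaddrinfo` call is a real API.

```python
import socket


class PinnedResolver:
    """Resolve each hostname once and keep returning the same address."""

    def __init__(self, lookup=None):
        # `lookup` is injectable so the sketch can be exercised offline.
        self._lookup = lookup or self._system_lookup
        self._pinned = {}

    @staticmethod
    def _system_lookup(host):
        # getaddrinfo may return several addresses for a round-robin name.
        infos = socket.getaddrinfo(host, 80, type=socket.SOCK_STREAM)
        return [info[4][0] for info in infos]

    def address_for(self, host):
        # Only the first call per host performs a lookup; later calls
        # reuse the pinned address, so all files come from one mirror.
        if host not in self._pinned:
            self._pinned[host] = self._lookup(host)[0]
        return self._pinned[host]
```

With one such resolver shared by the requests for Release, Release.gpg, and Packages, all three would go to the same mirror for the duration of an update run.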

Selecting a random particular mirror (like mirrors2.kernel.org, the
first one ‘ping’ gave me) does avoid the problem, though that doesn’t
rule out this having just avoided some particular problematic mirror.

Regards,
Jonathan






Bug#582352: cupt: downloading Packages.bz2 failed: invalid size

2010-05-20 Thread Eugene V. Lyubimkin
Jonathan Nieder wrote:
> So it looks like the Release is temporarily “out of sync” with other files.
Yup.

> Meanwhile, I have never run into this problem with apt-get.
> 
> ... ah, okay, maybe this is it: ftp.us.debian.org uses round-robin DNS
> to switch between multiple mirrors.  APT’s HTTP method copes with this
> by doing the lookup once and reusing the IP for a number of requests,
> whereas it looks like cupt is switching between mirrors too often.
You seem to be right. Cupt just passes all network work to the Curl library.

Khm. I would argue that it's not a Cupt problem and the other side providing
round-robin DNS should ensure the equality of files. I don't know whether it
is technically possible to enable or disable particular IPs on the fly...
I probably need to think about it more.

> Selecting a random particular mirror (like mirrors2.kernel.org, the
> first one ‘ping’ gave me) does avoid the problem, though that doesn’t
> rule out this having just avoided some particular problematic mirror.
Hm, given the reason above, using a static mirror (not "changing" content
between calls) should avoid this problem completely, no?

-- 
Eugene V. Lyubimkin aka JackYF, JID: jackyf.devel(maildog)gmail.com
C++/Perl developer, Debian Developer





Bug#582352: cupt: downloading Packages.bz2 failed: invalid size

2010-05-20 Thread Jonathan Nieder
Eugene V. Lyubimkin wrote:

> Khm. I would argue that it's not a Cupt problem and the other side providing
> round-robin DNS should ensure the equality of files.

Eh, it might be nice for them to do that, but we would have to either
change them or live with what we have.

In other words, maybe this is not the way DNS is supposed to be used,
but it is an assumption for “APT over HTTP” used on both ends (the
mirrors and the clients).  And this is a simple assumption that would
not be broken by any DNS cache.  It is the client’s responsibility to
use the same IP where it needs consistency.

> Hm, given the reason above, using a static mirror (not "changing" content
> between calls) should avoid this problem completely, no?

Yes, I only meant that I have not empirically ruled out other causes.
But I do think we’ve found the problem.

Now to find some time to fix it. :)

Thanks,
Jonathan






Bug#582352: cupt: downloading Packages.bz2 failed: invalid size

2010-05-20 Thread Eugene V. Lyubimkin
package cupt libcupt-perl mirrors
reassign 582352 mirrors
retitle 582352 different mirrors under single DNS should have equal content
affects 582352 + libcupt-perl
thanks

Jonathan Nieder wrote:
> Eugene V. Lyubimkin wrote:
> 
>> Khm. I would argue that it's not a Cupt problem and the other side providing
>> round-robin DNS should ensure the equality of files.
> 
> Eh, it might be nice for them to do that, but we would have to either
> change them or live with what we have.
> 
> In other words, maybe this is not the way DNS is supposed to be used,
> but it is an assumption for “APT over HTTP” used on both ends (the
> mirrors and the clients).  And this is a simple assumption that would
> not be broken by any DNS cache.  It is the client’s responsibility to
> use the same IP where it needs consistency.
Ideally, I disagree. Is there some RFC spec saying 'use the same IP if you need
consistency', or the like?

> Now to find some time to fix it. :)
Even given all the above, I'd be in favor of providing a workaround on Cupt's
side. But in this case I practically cannot. Curl has no options to control
DNS->IP selection, and the Cupt download system is written in a method- and
host-agnostic way, with a multi-process approach and internal pipelining. The
last two things mean that download methods cannot rely on any DNS cache. Saying
that not all files can be downloaded independently (in this case, Release and
Packages, but that also applies to the download of .debs) breaks too many
places in the system, and "fixing" this part would need a major rewrite, less
maintainable/scalable code, and hacks to avoid race condition bugs (see
#442189). The current download system avoids that through its fully parallel
design.

The possible solutions include a multithreaded instead of a multi-process
design (which has its downsides and requires a change of implementation
language) or implementing a file-based DNS cache in Curl instead of the
memory-based one (unlikely to happen, I suppose).
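
To make the second idea concrete, here is a rough sketch of a file-based cache that several downloader processes could share. It is an assumption-laden illustration, not a proposal for Curl's internals: the function `cached_address`, the JSON file layout, and the `lookup` callback are all hypothetical.

```python
import json
import os
import tempfile


def cached_address(host, cache_path, lookup):
    """Return the address recorded for `host`, resolving only on a miss."""
    try:
        with open(cache_path) as f:
            cache = json.load(f)
    except (FileNotFoundError, ValueError):
        # Missing or unreadable cache: start from an empty one.
        cache = {}
    if host in cache:
        return cache[host]

    cache[host] = lookup(host)
    # Write to a temporary file and rename it so readers never see a
    # half-written cache. Two processes missing at the same moment can
    # still race; a real implementation would need file locking.
    directory = os.path.dirname(os.path.abspath(cache_path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as f:
        json.dump(cache, f)
    os.replace(tmp_path, cache_path)
    return cache[host]
```

Because the cache lives on disk, separate processes fetching Release and Packages would see the same address. The sketch leaves expiry and locking unsolved, which is part of why such a cache is not a trivial addition.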

Summarizing: sorry, I can't provide a workaround in the near future.

Also, let's see what maintainers of 'mirrors' pseudo-package can suggest.

-- 
Eugene V. Lyubimkin aka JackYF, JID: jackyf.devel(maildog)gmail.com
C++/Perl developer, Debian Developer





Bug#582352: cupt: downloading Packages.bz2 failed: invalid size

2010-05-20 Thread Jonathan Nieder
Eugene V. Lyubimkin wrote:

> Is there some RFC spec saying 'use the same IP if you need
> consistency', or the like?

I was too focused on DNS before, and there is no problem there.  Of
course DNS does exactly what we want it to here.

However, a case could be made that the inconsistency between APT
mirrors served under the same hostname breaks HTTP (and perhaps
violates it, though I haven’t found a relevant passage), since an HTTP
proxy could ask the DNS resolver for a new server to connect to at any
time.

(Just to be clear, there is no redirection involved here.)

 $ wget -S http://ftp.us.debian.org/debian/dists/unstable/Release
 --2010-05-20 19:34:35--  http://ftp.us.debian.org/debian/dists/unstable/Release
 Resolving ftp.us.debian.org... 35.9.37.225, 64.50.236.52, 128.30.2.36, ...
 Connecting to ftp.us.debian.org|35.9.37.225|:80... connected.
 HTTP request sent, awaiting response... 
   HTTP/1.1 200 OK
 [...]

Should apt’s HTTP method be using IP addresses in its requests
instead?  Would this be safe, or do some mirrors use virtual hosts?
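
One variant of this idea is to connect to a fixed address while still naming the mirror in the Host header, so that name-based virtual hosts can keep working. The sketch below only illustrates that variant. It is not how apt's HTTP method is implemented, the function names `build_request` and `fetch_from_address` are hypothetical, and whether every mirror would serve such requests correctly is not established here.

```python
import socket


def build_request(hostname, path):
    """Build an HTTP/1.1 GET that names the mirror in the Host header."""
    lines = [
        "GET %s HTTP/1.1" % path,
        "Host: %s" % hostname,  # lets a virtual-hosted server pick the site
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_from_address(address, hostname, path, port=80, timeout=30):
    """Send the request to a specific IP address and return the raw reply."""
    with socket.create_connection((address, port), timeout=timeout) as sock:
        sock.sendall(build_request(hostname, path))
        chunks = []
        while True:
            data = sock.recv(65536)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)
```

For example, `fetch_from_address("35.9.37.225", "ftp.us.debian.org", "/debian/dists/unstable/Release")` would send the request to the first address from the wget output above while keeping the usual Host header.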

Jonathan






Bug#582352: cupt: downloading Packages.bz2 failed: invalid size

2010-05-21 Thread Eugene V. Lyubimkin
Jonathan Nieder wrote:
> Should apt’s HTTP method be using IP addresses in its requests
> instead?  Would this be safe, or do some mirrors use virtual hosts?
Erm, this bug is not related to APT HTTP method. You probably want to discuss
this matter with APT maintainers.

-- 
Eugene V. Lyubimkin aka JackYF, JID: jackyf.devel(maildog)gmail.com
C++/Perl developer, Debian Developer





Bug#582352: cupt: downloading Packages.bz2 failed: invalid size

2010-05-21 Thread Jonathan Nieder
Hi APT team,

As Eugene noticed, the use of round-robin DNS between out-of-sync
mirrors, by ftp.us.debian.org for example, makes it hard to reliably
fetch and verify the Debian archive's index files. Despite having
similar addresses Release.gpg, Release, and Packages can end up being
fetched from different mirrors. I suspect it is possible for this to
come up in some proxy setups, too, where the client has no control
over which mirror each file is fetched from.

I suggested that one possible solution would be to force use of IP
addresses for host names in requests made by the APT HTTP method. Of
course this is not ideal, because among other things it breaks virtual
hosts.

Eugene V. Lyubimkin wrote:

> Erm, this bug is not related to APT HTTP method.

This is about the protocol used by apt and other front-ends to
retrieve packages over HTTP, no?

Is your point that the same problem applies to other protocols like
FTP, too? In that case, I would disagree. With FTP, unlike HTTP, it is
easy to arrange for the Release.gpg, Release, and Packages files to be
obtained from a single mirror.

> You probably want to discuss
> this matter with APT maintainers.

Good idea, thanks. CC-ed.

Thoughts?
Jonathan






Bug#582352: cupt: downloading Packages.bz2 failed: invalid size

2010-05-21 Thread Eugene V. Lyubimkin
>> Erm, this bug is not related to APT HTTP method.
> 
> This is about the protocol used by apt and other front-ends to
> retrieve packages over HTTP, no?
> 
> Is your point that the same problem applies to other protocols like
> FTP, too?
No-no, I just didn't realize you wanted to discuss this question protocol-wise.

-- 
Eugene V. Lyubimkin aka JackYF, JID: jackyf.devel(maildog)gmail.com
C++/Perl developer, Debian Developer


