Re: rsync in apt sources.list?

2003-06-22 Thread Dan Jacobson
 But why at the end of http://home.tiscali.cz:8080/~cz210552/aptrsync.html :
 # Get anything we missed due to failed rsync's.  [EMAIL PROTECTED] 24 Mar 2002.
 os.system('apt-get update')

Well, it seems to me this just starts apt-get getting everything all
over again, http_proxy or not.  So how does apt-get know if a
/var/lib/apt/lists/*Packages file is up to date or not?  Might it be
unaware that we have changed a Packages file underneath its nose
because it only knows about changes that it made itself?
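
A rough sketch of the mechanism being asked about, offered only as an
illustration.  It assumes (nothing in this thread confirms it) that
apt's http method decides freshness with a conditional GET whose
If-Modified-Since value is taken from the mtime of the existing list
file.  The helper name is hypothetical.

```python
import os
from email.utils import formatdate


def if_modified_since_header(list_file: str) -> dict:
    """Build the conditional-GET header for an index file already on disk.

    Assumption: the client treats the file's mtime as its "last fetched"
    stamp, so whatever mtime rsync leaves behind is what the server sees.
    """
    mtime = os.path.getmtime(list_file)
    # HTTP dates are always expressed in GMT.
    return {"If-Modified-Since": formatdate(mtime, usegmt=True)}
```

If that assumption holds, an rsync run that leaves a current mtime on
the file should produce a 304 reply (a "Hit"), while a proxy that turns
the conditional request into a plain GET would force a full download
regardless.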




Re: rsync in apt sources.list?

2003-06-20 Thread Dan Jacobson
 Doing apt-get update just seems to start downloading the Packages.gz
 even though we just rsynced Packages.
Tim It could easily be a bug.
Radim It writes a HIT! message there and skips this file, because it
Radim is already up to date thanks to rsync.
Next time I will try with http_proxy unset, because I recall with
wwwoffle, a HTTP HEAD causes a GET or something.
Tim I use apt-proxy now and am happy.
I will investigate whether I, as a modem user, can use that to relieve
the apt-get update burden (needed even to upgrade just a few KB of
packages).




Re: rsync in apt sources.list?

2003-06-19 Thread Dan Jacobson
It seems the simplest solution is to just use
http://home.tiscali.cz:8080/~cz210552/aptrsync.html
But why does he do at the bottom

# Get anything we missed due to failed rsync's.  [EMAIL PROTECTED] 24 Mar 2002.
os.system('apt-get update')
# Used to have a call to apt-cache gencaches here, but I think that's
# redundant with the apt-get update above. [EMAIL PROTECTED] 24 Mar 2002.

Doing apt-get update just seems to start downloading the Packages.gz
even though we just rsynced Packages.  Is apt supposed to detect that
Packages is rather fresh and not download it? It just downloaded over
again for me.

And of course commenting out apt-get update means that if some of the
servers in sources.list don't run rsync, then they won't be hit.
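
For anyone trying this by hand: the rsynced index has to land under the
name apt expects in /var/lib/apt/lists.  A simplified sketch of that
naming follows, assuming the usual convention of dropping the scheme
and turning slashes into underscores.  The function name and the
example mirror are illustrative only, and real apt also escapes some
characters that this ignores.

```python
from urllib.parse import urlsplit


def list_filename(base_uri: str, index_path: str) -> str:
    """Map a sources.list URI plus an index path to a lists/ file name.

    Simplified assumption: strip the scheme and any user:password part,
    then replace every '/' with '_'.
    """
    parts = urlsplit(base_uri.rstrip("/") + "/" + index_path.lstrip("/"))
    host = parts.netloc.rsplit("@", 1)[-1]  # drop credentials, keep port
    return (host + parts.path).replace("/", "_")
```

With a hypothetical mirror at http://ftp.debian.org/debian and the
index dists/sid/main/binary-i386/Packages, this gives
ftp.debian.org_debian_dists_sid_main_binary-i386_Packages.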




Re: rsync in apt sources.list?

2003-06-19 Thread Tim Freeman
From: Dan Jacobson [EMAIL PROTECTED]
It seems the simplest solution is to just use
http://home.tiscali.cz:8080/~cz210552/aptrsync.html
But why does he do at the bottom

# Get anything we missed due to failed rsync's.  [EMAIL PROTECTED] 24 Mar 2002.
os.system('apt-get update')
# Used to have a call to apt-cache gencaches here, but I think that's
# redundant with the apt-get update above. [EMAIL PROTECTED] 24 Mar 2002.

Doing apt-get update just seems to start downloading the Packages.gz
even though we just rsynced Packages.  Is apt supposed to detect
Packages are rater fresh and not download? It just downloaded over
  ^ I can't quite guess your meaning here.
again for me.

It could easily be a bug.  The rsync servers I was hitting randomly
rejected connections, so I didn't reliably get current Packages files
unless I did the apt-get update.  I couldn't easily test that the
script actually improved performance, because I didn't control the
servers I was hitting and I didn't set up my own server to test
against.  I didn't carefully monitor what the apt-get update was
really doing.

The entire issue is moot, IMO, since the person running the server
said it couldn't support people routinely doing rsync against it
because rsync's compression used too much CPU time.  Unless rsync now
supports precomputing the compression, and the servers are configured
to do that, using aptrsync is not being a good citizen.

I use apt-proxy now and am happy.  From apt-proxy's manual entry I
suppose it often uses rsync to do its work, but now I have three
machines using my cache, so I figure the decreased load ought to
compensate for the increased CPU from using rsync.

-- 
Tim Freeman  [EMAIL PROTECTED]
GPG public key fingerprint ECDF 46F8 3B80 BB9E 575D  7180 76DF FE00 34B1 5C78 

P. S. Here's my summary of the previous discussion about why aptrsync
was not a good idea.  

Date: Mon,  8 Apr 2002 19:27:59 -0700
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED], [EMAIL PROTECTED], [EMAIL PROTECTED]
Subject: Re: Bug#128818: apt-rsync is great
From: [EMAIL PROTECTED] (Tim Freeman)

From: Jason Gunthorpe [EMAIL PROTECTED]
I think the folks on -devel have gone over it enough, 

For the benefit of other readers, the conversation you're talking
about probably starts at:

   http://lists.debian.org/deity/1999/deity-199910/msg2.html

and continues on the rsync list at:

   http://lists.samba.org/pipermail/rsync/1999-October/001403.html

Quite simply, rsync uses tremendous amounts of disk IO, it reads the
entire file on the server side and does lots of math on it, http on the
other hand is intrinsically rate limited by the requester.

I see.  rsync has a batch mode which might resolve some of the
issues.  It's described as experimental in the man entry for version
2.5.4-1 of the rsync package, which is the one in testing right now,
so maybe it makes sense not to depend on it yet.  The discussion cited
above is about 2.5 years old and doesn't mention batch mode at all,
perhaps because rsync's batch mode didn't exist then.

The as-yet-nonexistent compressor that is rsync-friendly would be
required to make a good solution for the whole problem.

However, I'm satisfied that building a version of apt-rsync that is
friendly to the server is blocked on the development of this other
software, so it's time to set this issue aside.

-- 
Tim Freeman   
[EMAIL PROTECTED]




Re: rsync in apt sources.list?

2003-06-17 Thread Corrin Lakeland

On Tuesday 17 June 2003 17:35, you wrote:

 Is there a link to just the changelog so I can see if they added this
 before I spend the modem time downloading? http://packages.debian.org
 doesn't seem to have links to changelogs, nor are they in separate
 files on the mirrors.

Dunno, if it isn't available then it sounds like a good thing for a feature 
request.  Though at modem speeds, 12MB only takes one hour.

 Cor There are technical solutions to precomputing the diffs used by
 Cor rsync, as well as solutions for diffing .gz files.  E.g. I was
 Cor able to perform apt-get update, upgrade in about half the time,
 Cor but nobody has put in the work required to get these nicely
 Cor integrated into the current tool set,

 is it in apt now?

No, I manually executed it.

 Cor set up on the servers with simple HOWTOs for mirrors.

 or you mean apt can do it, but not all the servers run an rsync
 daemon?

No, and no.

On machine a) I copied /var/cache/apt/archives 
I then repackaged every archive on machine a) 
(dpkg-split -s, rgzip, dpkg-split -j)
For Packages.gz, gunzip Packages.gz, rgzip Packages
One week later I ran apt-get update, apt-get upgrade on machine b)
I then repackaged all the archives on machine b)
I also renamed them so the names were the same as on machine a)
Next I did rproxy (rsync via http) from b) to a)

I timed this with both the repackaged and non repackaged versions. The 
repackaged version used much less network traffic (though not as much less as 
I would have liked).

So you see, it was a proof of concept, not a reimplementation that users can 
use.

In order to make it usable:

Get the rgzip package into Debian.  This patch is 100% stable (read it, it is 
short).  The only problem with it is it can only be used in 99% of 
situations, so to avoid unexpected bugs it probably should be off by default 
(e.g. it cannot be used in creating the dictionaries used by dict).  It can 
always be used in  creating .deb files.  When I proposed this to the gzip 
maintainer (Bdale Garbee IIRC) the only response I got is that it might go in 
upstream.  Well, Jean Loup is rather busy, and hasn't put it in upstream in 
the last year, so I think we shouldn't wait.

Modify dpkg to use rgzip.  Currently dpkg uses zlib if it is installed and 
calls gzip if it isn't, so deleting the use of zlib makes this trivial.

Write a program that precomputes the rolling checksums used by rproxy, and tie 
it into debuild.  These checksums can then be uploaded with dupload.  This 
seems technically feasible to me, and the whole system works without this 
file as well.

Modify rproxy to use the precomputed checksums if present rather than 
generating them.

Modify rproxy to cope with our version numbering so it knows two files are 
different versions if their version number differs.  I expect this is quite 
easy.

Do some stress/security checks on rproxy; mirrors won't appreciate us giving 
them security holes.  I don't know the code, so I don't know how hard this 
will be.

Modify the standard mirror script to also copy the rolling checksum files.

Convince some mirrors to install rproxy.
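
To make the "precomputed rolling checksums" step above concrete, here
is a minimal sketch of an rsync-style weak checksum and a per-block
signature table that could be generated once at build time.  It is not
rproxy's code or file format: the function names, the 2048-byte block
size and the use of MD5 as the strong hash are all assumptions made for
illustration.

```python
import hashlib
from typing import List, Tuple

M = 1 << 16  # both halves of the weak checksum are kept modulo 2**16


def weak_checksum(block: bytes) -> Tuple[int, int]:
    """rsync-style weak checksum: a plain sum and a position-weighted sum."""
    n = len(block)
    a = sum(block) % M
    b = sum((n - i) * byte for i, byte in enumerate(block)) % M
    return a, b


def roll(a: int, b: int, out_byte: int, in_byte: int, n: int) -> Tuple[int, int]:
    """Slide an n-byte window one byte forward without rescanning it."""
    a = (a - out_byte + in_byte) % M
    b = (b - n * out_byte + a) % M
    return a, b


def block_signatures(data: bytes, block_len: int = 2048) -> List[tuple]:
    """Signature table a mirror could serve instead of computing it per request."""
    sigs = []
    for off in range(0, len(data), block_len):
        block = data[off:off + block_len]
        # MD5 is a stand-in for the strong hash (an assumption, see above).
        sigs.append((off, weak_checksum(block), hashlib.md5(block).hexdigest()))
    return sigs
```

The point of precomputing is that the per-request cost on the server
drops to serving a small static file; the whole-file read and the
checksum arithmetic happen once, when the package is built.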

 What about http://home.tiscali.cz:8080/~cz210552/aptrsync.html
 is it now obsolete?

That looks like it does what you were looking for above (update via rsync,  
upgrade via http).

What do you think?




Re: rsync in apt sources.list?

2003-06-17 Thread Colin Watson
On Tue, Jun 17, 2003 at 09:17:37PM +1200, Corrin Lakeland wrote:
 get the rgzip package into debian.  This patch is 100% stable (read
 it, it is short).  The only problem with it is it can only be used in
 99% of situations, so to avoid unexpected bugs it probably should be
 off by default (e.g. it cannot be used in creating the dictionaries
 used by dict).  It can always be used in  creating .deb files.  When I
 proposed this to the gzip maintainer (Bdale Garbee IIRC) the only
 response I got is that it might go in upstream.  Well, Jean Loup is
 rather busy, and hasn't put it in upstream in the last year, so I
 think we shouldn't wait.

gzip (1.3.5-4) unstable; urgency=low

  * merge patch from Rusty Russell that adds --rsyncable option to gzip.
    This modifies the output stream to allow rsync to transfer updated .gz
    files much more effectively.  The resulting .gz files should be compatible
    with the existing gunzip.  The plan is that if this works out well for
    Debian, the functionality will be included in a future upstream gzip
    release.  Closes: #116183, #118118, #134741

 -- Bdale Garbee [EMAIL PROTECTED]  Thu, 13 Feb 2003 23:50:23 -0700
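
A toy illustration of the idea behind such an option, not the actual
patch: restart the compressor at points chosen from the content itself,
so that a local edit only disturbs the compressed output up to the next
restart point.  The window and modulus values and the function name
below are arbitrary choices for the sketch, and it uses zlib framing
rather than gzip's.

```python
import zlib


def rsyncable_compress(data: bytes, window: int = 64, modulus: int = 256) -> bytes:
    """Compress data, resetting compressor state at content-defined points.

    A boundary is declared wherever the sum of the last `window` input
    bytes is divisible by `modulus`; Z_FULL_FLUSH then discards the
    compressor's history so later output no longer depends on earlier input.
    """
    comp = zlib.compressobj()
    out = []
    rolling = 0
    start = 0
    for i, byte in enumerate(data):
        rolling += byte
        if i >= window:
            rolling -= data[i - window]  # drop the byte leaving the window
        if i + 1 >= window and rolling % modulus == 0:
            out.append(comp.compress(data[start:i + 1]))
            out.append(comp.flush(zlib.Z_FULL_FLUSH))
            start = i + 1
    out.append(comp.compress(data[start:]))
    out.append(comp.flush())
    return b"".join(out)
```

The output is still an ordinary zlib stream, so zlib.decompress() reads
it unchanged; the price is a somewhat larger file, since every reset
throws away compression history.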

-- 
Colin Watson  [EMAIL PROTECTED]