Re: Debdelta and Streaming Package Installation for dpkg/APT

2011-05-06 Thread Guillem Jover
Hi!

On Mon, 2011-04-25 at 09:47:47 +0200, A Mennucc wrote:
 On 25/04/2011 07:28, Guillem Jover wrote:
   But my main problem right now is that I didn't find clear
  documentation of the “debdelta” file format, the closest thing that I
  found was the debpatch [1] example file in the debdelta package.

 (unfortunately the problem is that, from 2005 until about two months ago, I
 was the only one working on debdelta, so the documentation is mostly
 in my head...)

Sure. I think just documenting the file format, and nothing else, would
be enough to get a clear idea of how all this works, and at least for
me how to try to integrate it, and where. It would not need to be too
exhaustive though.

  Something that I found undesirable is that it seems to require executing
  a shell script included in the debdelta package to regenerate the data?

 yes. Why would that be undesirable? Shell scripts are simple, well
 known, and very powerful. (Indeed they are often also used for
 postinst etc. in deb packages...) Using shell scripts for
 patches was a design decision in 2005 that I am quite happy with: it
 enabled me to improve the patches a lot in the following years,
 without ever changing the delta internal format.

While I can understand that it makes changing the format easier, and
it was probably the right tool for fast prototyping, it also implies
the file cannot be automatically inspected/verified/handled w/o
implicitly trusting it, which I don't think is a good property for
a file format.

The case you mention about maintainer scripts is not equivalent, as
they are only run on installation/etc, and not on say dpkg-deb
extraction (say -I, -c, -f, etc), or when joining split packages
with dpkg-split.

   Anyway I don't quite like the idea, it would imply
  offloading some of the dpkg unpacking logic out of dpkg, or just
  duplicating it inside dpkg itself to deal with unpacking from tar
  and from the file system itself and to rely safely on the file metadata
  from the binary package.

 yes, that second one would be my idea. Well, however we think of it,
 since there is an overlap we may want to eliminate, either we
 bring some logic of debdelta into dpkg, or some logic of dpkg into
 debdelta...

So, my first instinct would be to bring that logic into dpkg (or a new
dpkg-debdelta or whatever), instead of offloading it, but as mentioned
earlier I don't think I've enough understanding of how debdelta works
internally to know how the integration could happen, or if that would
be the best way to do it.

thanks,
guillem


-- 
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org
with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20110506085932.ga1...@gaara.hadrons.org



Re: Debdelta and Streaming Package Installation for dpkg/APT

2011-04-25 Thread A Mennucc

hi again,

On 25/04/2011 07:28, Guillem Jover wrote:
 The data.tar does not need to be recompressed for dpkg to be able to
 install it.

(that is true)

 But my main problem right now is that I didn't find clear
 documentation of the “debdelta” file format, the closest thing that I
 found was the debpatch [1] example file in the debdelta package.

(unfortunately the problem is that, from 2005 until about two months ago, I
was the only one working on debdelta, so the documentation is mostly
in my head...)

 Something that I found undesirable is that it seems to require executing
 a shell script included in the debdelta package to regenerate the data?

yes. Why would that be undesirable? Shell scripts are simple, well
known, and very powerful. (Indeed they are often also used for
postinst etc. in deb packages...) Using shell scripts for
patches was a design decision in 2005 that I am quite happy with: it
enabled me to improve the patches a lot in the following years,
without ever changing the delta internal format.

 It is then reasonable to collapse these two parts, and this would
 possibly speed up the upgrade a bit.

 Yeah, AFAIUI the debdelta side seems to be similar in nature to
 dpkg-split (specifically the --auto command), so I think it might make
 more sense to integrate that part into dpkg instead of apt. At least
 the retrieving part would still need to be integrated into apt though.

don't know about this; I had a different idea in mind; I will investigate.
 Here is my idea. When 'debdelta-upgrade' is called in upgrading a
 package 'foobar' it currently creates 'foobar_2.deb'. By an
 appropriate cmdline switch, instead of creating a 'foobar_2.deb', it
 would directly save all of its files to the filesystem, and it would
 add an extension to all the file names, making sure that no file name
 would conflict with (= overwrite) a preexisting file on the filesystem;
 then it would create a file 'foobar_2.deb_unpacked', that would be
 just a text file similar to the usual control file, but specifying
 also the extension used, and possibly the list of unpacked files.

 Well, doesn't it have to verify the reconstructed package matches the
 expected one?

yes; I would make sure that debdelta does that (in an internal pipe).

 Anyway I don't quite like the idea, it would imply
 offloading some of the dpkg unpacking logic out of dpkg, or just
 duplicating it inside dpkg itself to deal with unpacking from tar
 and from the file system itself and to rely safely on the file metadata
 from the binary package.

yes, that second one would be my idea. Well, however we think of it,
since there is an overlap we may want to eliminate, either we
bring some logic of debdelta into dpkg, or some logic of dpkg into
debdelta...

 The control files would also need to be
 preserved somewhere, etc.

yes.

a.



-- 
Archive: http://lists.debian.org/4db52723.9050...@debian.org



Re: Debdelta and Streaming Package Installation for dpkg/APT

2011-04-24 Thread A Mennucc

Dear Ishan, and also in CC Michael and debian-dpkg,

here is a reply to a message you sent me long ago; this message also
referenced a message in debian-dpkg [1], so I am CC-ing them as well.

The message, in short, was:

On 06/01/2011 14:55, Ishan Jayawardena wrote:
 I came across the idea of streaming package installation[2] for
 dpkg/APT in Debian's wiki page for last year's summer of code. I
 found it interesting. I wrote to the debian-dpkg list about my
 interest and got two replies from them. In the replies, there was a
 mention about exploring the possibilities of using debdeltas in the
 installation process to make it faster.

Yes, there is a way (and it is actually not very difficult, at least
on the 'debdelta' side).

Let me summarize. When 'debdelta-upgrade' (or 'debpatch') recreates a
deb, one step is reassembling the data.tar part inside it; this part
is moreover compressed (gzip, bzip2 or lately lzma). This
'reassembling and compressing' takes time (both for CPU and for HD),
and is moreover quite useless, since, shortly afterwards, 'apt' will call
'dpkg -i', which decompresses and reopens the data.tar in the deb.

It is then reasonable to collapse these two parts, and this would
possibly speed up the upgrade a bit.

Here is my idea. When 'debdelta-upgrade' is called in upgrading a
package 'foobar' it currently creates 'foobar_2.deb'. By an
appropriate cmdline switch, instead of creating a 'foobar_2.deb', it
would directly save all of its files to the filesystem, and it would
add an extension to all the file names, making sure that no file name
would conflict with (= overwrite) a preexisting file on the filesystem;
then it would create a file 'foobar_2.deb_unpacked', that would be
just a text file similar to the usual control file, but specifying
also the extension used, and possibly the list of unpacked files.
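
A minimal sketch of the idea above, in Python, may make it more concrete.
Everything specific in it is an assumption made for illustration only: the
base suffix '.debdelta-new', the field names 'Unpacked-Extension' and
'Unpacked-Files', and the helper names are hypothetical, and neither
debdelta nor dpkg implements this. It only shows how an extension can be
chosen so that no existing file is overwritten, and how the
'foobar_2.deb_unpacked' text file could record it.

```python
import os


def pick_extension(paths, root, base=".debdelta-new"):
    """Return a suffix such that no (relative) path + suffix exists under root."""
    n = 0
    while True:
        ext = base if n == 0 else f"{base}{n}"
        if not any(os.path.lexists(os.path.join(root, p + ext)) for p in paths):
            return ext
        n += 1


def write_manifest(manifest_path, package, version, ext, paths):
    """Write a control-file-like text manifest (field names are hypothetical)."""
    lines = [
        f"Package: {package}",
        f"Version: {version}",
        f"Unpacked-Extension: {ext}",
        "Unpacked-Files:",
    ]
    # Continuation lines start with a space, as in Debian control files.
    lines += [f" {p}" for p in paths]
    with open(manifest_path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(lines) + "\n")


def unpack_with_extension(files, root, package, version):
    """files maps a relative path to its bytes; returns (ext, manifest_path)."""
    paths = sorted(files)
    ext = pick_extension(paths, root)
    for rel in paths:
        dest = os.path.join(root, rel + ext)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        with open(dest, "wb") as fh:
            fh.write(files[rel])
    manifest = os.path.join(root, f"{package}_{version}.deb_unpacked")
    write_manifest(manifest, package, version, ext, paths)
    return ext, manifest
```

A single extension is used for the whole package so that the installer only
has to learn one suffix from the manifest in order to rename the files into
place.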

I could change debdelta to accomplish that; it would not be a huge change.

What I would need help with is changing 'dpkg' so that it is
able to install starting from 'foobar_2.deb_unpacked', and changing APT
so that it interacts with 'debdelta' to create the
'foobar_2.deb_unpacked' files and passes them to dpkg.

Note that the above idea overlaps a lot with [2].

a.

ps: for the sake of brevity and clarity, I am skipping a lot of details: I
preferred to give the whole picture first.

Links:
[1] http://lists.debian.org/debian-dpkg/2011/01/msg8.html
[2] http://wiki.debian.org/SummerOfCode2010/StreamingPackageInstall



-- 
Archive: http://lists.debian.org/4db499b9.6090...@debian.org



Re: Debdelta and Streaming Package Installation for dpkg/APT

2011-04-24 Thread Guillem Jover
Hi!

On Sun, 2011-04-24 at 23:44:25 +0200, A Mennucc wrote:
 here is a reply to a message you sent me long ago; this message also
 referenced a message in debian-dpkg [1], so I am CC-ing them as well.

Ah perfect, had in mind talking about the apt/debdelta integration
GSoC proposal [0] with you guys, but have not had the time to check
several details, anyway here it is, still with some facts not checked.

  [0] http://wiki.debian.org/SummerOfCode2011/AptDebdeltaIntegration

 Let me summarize. When 'debdelta-upgrade' (or 'debpatch') recreates a
 deb, one step is reassembling the data.tar part inside it; this part
 is moreover compressed (gzip, bzip2 or lately lzma). This
 'reassembling and compressing' takes time (both for CPU and for HD),
 and is moreover quite useless, since, shortly afterwards, 'apt' will call
 'dpkg -i', which decompresses and reopens the data.tar in the deb.

The data.tar does not need to be recompressed for dpkg to be able to
install it. But my main problem right now is that I didn't find clear
documentation of the “debdelta” file format, the closest thing that I
found was the debpatch [1] example file in the debdelta package.
Something that I found undesirable is that it seems to require executing
a shell script included in the debdelta package to regenerate the data?

  [1] /usr/share/debdelta/debpatch.sh

 It is then reasonable to collapse these two parts, and this would
 possibly speed up the upgrade a bit.

Yeah, AFAIUI the debdelta side seems to be similar in nature to
dpkg-split (specifically the --auto command), so I think it might make
more sense to integrate that part into dpkg instead of apt. At least
the retrieving part would still need to be integrated into apt though.

 Here is my idea. When 'debdelta-upgrade' is called in upgrading a
 package 'foobar' it currently creates 'foobar_2.deb'. By an
 appropriate cmdline switch, instead of creating a 'foobar_2.deb', it
 would directly save all of its files to the filesystem, and it would
 add an extension to all the file names, making sure that no file name
 would conflict with (= overwrite) a preexisting file on the filesystem;
 then it would create a file 'foobar_2.deb_unpacked', that would be
 just a text file similar to the usual control file, but specifying
 also the extension used, and possibly the list of unpacked files.

Well, doesn't it have to verify the reconstructed package matches the
expected one? Anyway I don't quite like the idea, it would imply
offloading some of the dpkg unpacking logic out of dpkg, or just
duplicating it inside dpkg itself to deal with unpacking from tar
and from the file system itself and to rely safely on the file metadata
from the binary package. The control files would also need to be
preserved somewhere, etc.
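
To illustrate the verification concern, here is a small Python sketch of
checking a reconstructed stream against an expected digest while it is
being copied, without first writing a recompressed .deb. It is only an
illustration: the function name is hypothetical, SHA-256 is an arbitrary
choice of algorithm, and nothing here reflects how debdelta or dpkg
actually verify packages.

```python
import hashlib
import io


def copy_and_verify(src, dst, expected_hex, algo="sha256", chunk_size=65536):
    """Copy src to dst in chunks, hashing as we go; raise on digest mismatch."""
    digest = hashlib.new(algo)
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
        dst.write(chunk)
    actual = digest.hexdigest()
    if actual != expected_hex:
        raise ValueError(f"digest mismatch: expected {expected_hex}, got {actual}")
    return actual
```

Note that the data has already been written to dst by the time a mismatch
is detected, so whatever consumes dst must not treat the result as final
until the function returns successfully. This is one reason unpacking
directly to the file system needs care.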

thanks,
guillem


-- 
Archive: http://lists.debian.org/20110425052856.ga15...@gaara.hadrons.org



Re: Streaming Package Installation for dpkg/APT

2011-01-05 Thread Lars Wirzenius
On Wed, 2011-01-05 at 07:01 +0100, Guillem Jover wrote:
 Something which I guess would speed up the installation process could
 be to just make apt download the packages in self-contained batches,
 which can be unpacked/configured independently. This would also not
 really need any change in dpkg AFAICS. This way the installation
 process could start sooner than having to wait for the whole thing to
 get downloaded. It does not remove the need to store those batched
 packages on disk, but still.

I can't look up the URLs for this, but when I worked for Canonical we
discussed something like this at one UDS, and there should be blueprints
and wiki pages on the Ubuntu sites for this. Some searching should turn
them up.

From memory, what we came up with was basically what Guillem hints at:

* apt will order its downloads in installation order
* whenever apt has a self-contained batch, it will feed it to dpkg
* while dpkg runs, apt will continue to download things in the
background
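
The three steps above can be sketched in Python as follows. This is only a
rough illustration under stated assumptions: the function name and the
shape of the 'depends' mapping are hypothetical, dependencies are reduced
to plain package names (no versions, alternatives or virtual packages), and
neither apt nor dpkg is actually invoked.

```python
def self_contained_batches(download_order, depends, installed=()):
    """Yield lists of packages that can be handed to the installer together.

    download_order: package names in the order they finish downloading.
    depends: dict mapping a package to the set of packages it depends on.
    installed: packages already present on the system.
    """
    done = set(installed)
    pending = []
    for pkg in download_order:
        pending.append(pkg)
        available = done | set(pending)
        # The batch is self-contained once no pending package needs
        # something that is neither installed nor in the batch itself.
        if all(depends.get(p, set()) <= available for p in pending):
            yield list(pending)
            done.update(pending)
            pending = []
    if pending:
        raise ValueError(f"unsatisfied dependencies for: {pending}")
```

Because this is a generator, the caller can install each batch as soon as
it is yielded while further downloads are still in progress. If downloads
arrive strictly in installation order, most batches contain a single
package, and only mutually dependent packages end up grouped together.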

Further, we discussed the possibility of doing some of the dpkg
installation phases in parallel, even while waiting for the rest of a
batch to be downloaded: for example, unpacking might be possible already
at that time. This is more error prone and more complicated, though.

Related to these discussions we also discussed the possibility of
speeding up downloads by using debdeltas. debdelta seems to work quite
well, and it might be a good idea for Debian to adopt it officially.

-- 
Blog/wiki/website hosting with ikiwiki (free for free software):
http://www.branchable.com/


-- 
Archive: http://lists.debian.org/1294267575.2953.42.ca...@havelock.lan



Re: Streaming Package Installation for dpkg/APT

2011-01-05 Thread Ishan Jayawardena
Hi,

Guillem and Lars, thank you very much for your detailed replies. They
are really informative and encouraging.

What I learned from your replies is that the idea is quite complex
and that, instead of streaming, we can think of other ways of speeding up
the installation process, which has taken my idea in a new direction.
As you suggested, I will also look into the possibility of
using self-contained batches and debdeltas with apt/dpkg to
speed up the process.

But still, the streaming idea also looks interesting to me. I will
study apt/dpkg functionality and streaming technologies in more detail
and try to come up with a suggestion on it. My feeling is that we can use
a tree-like structure to resolve the dependencies and use it in the
streaming process in some way to achieve our goal. I am not sure whether
it's doable, but I would like to develop the idea on top of
this. I will let you know if I come up with something
interesting.

I really appreciate your help.

Thank you.

-- 
Regards,
Ishan Jayawardena.


-- 
Archive: 
http://lists.debian.org/aanlktimt5fmfp0cvh_6jsnhyerd=5vwu1dda3ouv2...@mail.gmail.com



Streaming Package Installation for dpkg/APT

2011-01-04 Thread Ishan Jayawardena
Hi,

I would like to know about the streaming package installation for
dpkg/APT. I read about this from last year's summer of code ideas list
of Debian [1], and found it interesting. I also found that it had not
been taken by any of the applicants, and, therefore, I would like to
work on it this summer.

Is there any ongoing development related to that idea? Apart from the
description given in [1], are there any known concerns
about it? I would like to hear your ideas and suggestions before I
proceed. Please let me know if you have something to share with me;
I'm looking forward to your feedback.


[1] http://wiki.debian.org/SummerOfCode2010/StreamingPackageInstall

Thank you.
-- 
Regards,
Ishan Jayawardena.


-- 
Archive: 
http://lists.debian.org/aanlktikew9zujplwmdamxlje6690_pgmkxt9khs4w...@mail.gmail.com



Re: Streaming Package Installation for dpkg/APT

2011-01-04 Thread Guillem Jover
Hi!

On Wed, 2011-01-05 at 09:55:29 +0530, Ishan Jayawardena wrote:
 I would like to know about the streaming package installation for
 dpkg/APT. I read about this from last year's summer of code ideas list
 of Debian [1], and found it interesting. I also found that it had not
 been taken by any of the applicants, and, therefore, I would like to
 work on it this summer.

You might want to talk with the people involved in that proposal, as I
don't think/remember it ever being discussed on this list. CCed them now.
CCed Lars too, who AFAIR mentioned something like this to me at some
point?

 Is there any ongoing development related to that idea?

I don't know of any.

 There is a description given in [1] and apart from that, are there
 any concerns of it? I would like to know your ideas and suggestions
 about it, to proceed. Please let me know if you have something to
 share with me, I'm looking forward to your feedback.

Michael and Simon might be able to fill the blanks.

About concerns, the one that comes to mind immediately is that dpkg
treats the packages as the basic units of operation: when invoked, it
first parses the control files for all provided packages, and then
operates on them, reordering if needed, bailing out if dependencies
cannot be satisfied, breaking cycles, etc. If the packages are not on
disk, and they are streamed to dpkg, then it might not be able to
operate properly. That might not be an insurmountable issue, but then
I've not thought this through too much...
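
A small Python sketch may help show why the control information for all
packages has to be known before anything is unpacked. It is a deliberately
simplified illustration, not dpkg's algorithm: the function name and the
'depends' mapping are hypothetical, dependencies are plain names, and
cycles are only reported rather than broken. It uses graphlib from the
Python standard library (3.9 or later).

```python
from graphlib import CycleError, TopologicalSorter


def plan_order(depends, installed=()):
    """Return an order in which dependencies come before their dependents.

    depends maps each package to be installed to the set of packages it
    depends on. Raises ValueError if a dependency is neither installed nor
    among the packages provided, or if the dependencies form a cycle.
    """
    known = set(depends) | set(installed)
    missing = {d for deps in depends.values() for d in deps} - known
    if missing:
        raise ValueError(f"unsatisfiable dependencies: {sorted(missing)}")
    # Only edges between packages being installed matter for ordering.
    graph = {p: {d for d in deps if d in depends} for p, deps in depends.items()}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        # dpkg breaks cycles itself; this sketch only reports them.
        raise ValueError(f"dependency cycle: {exc.args[1]}") from exc
```

Both the missing-dependency check and the ordering need the complete
'depends' mapping up front, which is exactly the information a purely
streamed installation would not yet have for packages that have not
arrived.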

Something which I guess would speed up the installation process could
be to just make apt download the packages in self-contained batches,
which can be unpacked/configured independently. This would also not
really need any change in dpkg AFAICS. This way the installation
process could start sooner than having to wait for the whole thing to
get downloaded. It does not remove the need to store those batched
packages on disk, but still.

 [1] http://wiki.debian.org/SummerOfCode2010/StreamingPackageInstall

thanks,
guillem


-- 
Archive: http://lists.debian.org/20110105060107.ga...@gaara.hadrons.org