Re: Debdelta and Streaming Package Installation for dpkg/APT
Hi!

On Mon, 2011-04-25 at 09:47:47 +0200, A Mennucc wrote:
> On 25/04/2011 07:28, Guillem Jover wrote:
> > But my main problem right now is that I didn't find clear documentation of the “debdelta” file format, the closest thing that I found was the debpatch [1] example file in the debdelta package.
>
> (unfortunately the problem is that, from 2005 to ~two months ago, I was the only one working on debdelta.. so the documentation is mostly in my head...)

Sure. I think just documenting the file format, and nothing else, would be enough to get a clear idea of how all this works, and at least for me how to try to integrate it, and where. It would not need to be too exhaustive though.

> > Something that I found undesirable is that it seems to require executing a shell script included in the debdelta package to regenerate the data?
>
> yes. Why would that be undesirable? Shell scripts are simple, well known, and very powerful. (Indeed they are often used also for postinst etc. in deb packages...). Using shell scripts for patches was a design decision in 2005 that I am quite happy with: it enabled me to improve the patches a lot in the following years, without ever changing the delta internal format.

While I can understand that it makes changing the format easier, and it was probably the right tool for fast prototyping, it also implies the file cannot be automatically inspected/verified/handled w/o implicitly trusting it, which I don't think is a good property for a file format. The case you mention about maintainer scripts is not equivalent, as they are only run on installation/etc, and not on say dpkg-deb extraction (say -I, -c, -f, etc), or when joining split packages with dpkg-split.

> > Anyway I don't quite like the idea, it would imply offloading some of the dpkg unpacking logic out of dpkg, or just duplicating it inside dpkg itself to deal with unpacking from tar and from the file system itself and to rely safely on the file metadata from the binary package.
>
> yes that second one would be my idea. Well, whichever way we think about it, since there is an overlap we may want to eliminate, then either we bring some logic of debdelta into dpkg, or some logic of dpkg into debdelta...

So, my first instinct would be to bring that logic into dpkg (or a new dpkg-debdelta or whatever), instead of offloading it, but as mentioned earlier I don't think I've enough understanding of how debdelta works internally to know how the integration could happen, or if that would be the best way to do it.

thanks,
guillem

--
To UNSUBSCRIBE, email to debian-dpkg-requ...@lists.debian.org with a subject of unsubscribe. Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20110506085932.ga1...@gaara.hadrons.org
Re: Debdelta and Streaming Package Installation for dpkg/APT
hi again,

On 25/04/2011 07:28, Guillem Jover wrote:
> The data.tar does not need to be recompressed for dpkg to be able to install it.

(that is true)

> But my main problem right now is that I didn't find clear documentation of the “debdelta” file format, the closest thing that I found was the debpatch [1] example file in the debdelta package.

(unfortunately the problem is that, from 2005 to ~two months ago, I was the only one working on debdelta.. so the documentation is mostly in my head...)

> Something that I found undesirable is that it seems to require executing a shell script included in the debdelta package to regenerate the data?

yes. Why would that be undesirable? Shell scripts are simple, well known, and very powerful. (Indeed they are often used also for postinst etc. in deb packages...). Using shell scripts for patches was a design decision in 2005 that I am quite happy with: it enabled me to improve the patches a lot in the following years, without ever changing the delta internal format.

> > It is then reasonable to collapse these two parts, and this would possibly speed up the upgrade a bit.
>
> Yeah, AFAIUI the debdelta side seems to be similar in nature to dpkg-split (specifically the --auto command), so I think it might make more sense to integrate that part into dpkg instead of apt. At least the retrieving part would still need to be integrated into apt though.

don't know about this; I had a different idea in mind; I will investigate.

> > Here is my idea. When 'debdelta-upgrade' is called in upgrading a package 'foobar' it currently creates 'foobar_2.deb'. By an appropriate cmdline switch, instead of creating a 'foobar_2.deb', it would directly save all of its files to the filesystem, and it would add an extension to all the file names, making sure that no file name would conflict with (= overwrite) a preexisting file on the filesystem; then it would create a file 'foobar_2.deb_unpacked', that would be just a text file similar to the usual control file, but specifying also the extension used, and possibly the list of unpacked files.
>
> Well, doesn't it have to verify the reconstructed package matches the expected one?

yes; I would make sure that debdelta does that (in an internal pipe).

> Anyway I don't quite like the idea, it would imply offloading some of the dpkg unpacking logic out of dpkg, or just duplicating it inside dpkg itself to deal with unpacking from tar and from the file system itself and to rely safely on the file metadata from the binary package.

yes that second one would be my idea. Well, whichever way we think about it, since there is an overlap we may want to eliminate, then either we bring some logic of debdelta into dpkg, or some logic of dpkg into debdelta...

> The control files would also need to be preserved somewhere, etc.

yes.

a.

--
Archive: http://lists.debian.org/4db52723.9050...@debian.org
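The "verify in an internal pipe" step mentioned in this message could look roughly like the sketch below. It is only an illustration, not debdelta's actual code: the function name `verified_stream` is hypothetical, and the choice of SHA-256 as the checksum is an assumption (the thread does not say which checksum would be compared).

```python
import hashlib


def verified_stream(chunks, expected_sha256):
    """Pass reconstructed data through unchanged while hashing it.

    `chunks` is any iterable of bytes objects (for instance pieces of the
    reconstructed package as they are produced). After the last chunk the
    digest is compared with `expected_sha256`; a mismatch raises ValueError.
    Hypothetical helper: SHA-256 is an assumed choice of checksum.
    """
    digest = hashlib.sha256()
    for chunk in chunks:
        digest.update(chunk)
        yield chunk
    if digest.hexdigest() != expected_sha256:
        raise ValueError("reconstructed package does not match expected checksum")
```

Note that the consumer has already seen all the data by the time the check fails. In a scheme like the one proposed, this suggests the files written under a temporary extension should not be moved into place until the stream has been read to the end without an error.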
Re: Debdelta and Streaming Package Installation for dpkg/APT
Dear Ishan, and also in CC Michael and debian-dpkg,

here is a reply to a message you sent me long ago; this message also referenced a message in debian-dpkg [1], so I am CC-ing them as well. The message, in short, was:

On 06/01/2011 14:55, Ishan Jayawardena wrote:
> I came across the idea of streaming package installation [2] for dpkg/APT in Debian's wiki page for last year's summer of code. I found it interesting. I wrote to the debian-dpkg list about my interest and got two replies from them. In the replies, there was a mention about exploring the possibilities of using debdeltas in the installation process to make it faster.

Yes, there is a way (and it is actually not very difficult, at least on the 'debdelta' side).

Let me summarize. When 'debdelta-upgrade' (or 'debpatch') recreates a deb, one step is reassembling the data.tar part inside it; this part moreover is compressed (gzip, bzip2 or lately lzma). This 'reassembling and compressing' takes time (both for CPU and for HD), and is moreover quite useless, since, shortly afterwards, 'apt' will call 'dpkg -i' that decompresses and reopens the data.tar in the deb.

It is then reasonable to collapse these two parts, and this would possibly speed up the upgrade a bit.

Here is my idea. When 'debdelta-upgrade' is called in upgrading a package 'foobar' it currently creates 'foobar_2.deb'. By an appropriate cmdline switch, instead of creating a 'foobar_2.deb', it would directly save all of its files to the filesystem, and it would add an extension to all the file names, making sure that no file name would conflict with (= overwrite) a preexisting file on the filesystem; then it would create a file 'foobar_2.deb_unpacked', that would be just a text file similar to the usual control file, but specifying also the extension used, and possibly the list of unpacked files.

I could change debdelta to accomplish that, it would not be a huge change. Someone should help instead in changing 'dpkg' so that it would be able to install starting from 'foobar_2.deb_unpacked', and in changing APT so that it would interact with 'debdelta' to create the 'foobar_2.deb_unpacked' files and pass them to dpkg.

Note that the above idea overlaps a lot with [2].

a.

ps: for the sake of brevity and clarity, I am skipping a lot of details: I preferred to give the whole picture first.

Links:
[1] http://lists.debian.org/debian-dpkg/2011/01/msg8.html
[2] http://wiki.debian.org/SummerOfCode2010/StreamingPackageInstall

--
Archive: http://lists.debian.org/4db499b9.6090...@debian.org
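The 'foobar_2.deb_unpacked' proposal above can be illustrated with a small sketch. Everything here is an assumption made for the example: the function names, the default extension `.debdelta-new`, and the field names `Unpack-Suffix` and `Unpacked-Files` are hypothetical, and no such option exists in debdelta or dpkg as far as this thread shows.

```python
import os


def pick_suffix(root, relpaths, base=".debdelta-new"):
    """Return an extension such that no `relpath + extension` exists under root.

    The default extension is a hypothetical name; a counter is appended
    until none of the target names collides with an existing file.
    """
    n = 0
    while True:
        suffix = base if n == 0 else f"{base}{n}"
        if not any(os.path.lexists(os.path.join(root, p + suffix)) for p in relpaths):
            return suffix
        n += 1


def unpack_with_suffix(root, files, deb_name):
    """Write each file as `path + suffix` and describe the result in a manifest.

    `files` maps relative paths (no leading slash) to their contents as bytes.
    The manifest is a control-file-like text file named `<deb_name>_unpacked`;
    its field names are invented for this sketch.
    """
    suffix = pick_suffix(root, files)
    for rel, data in sorted(files.items()):
        dest = os.path.join(root, rel + suffix)
        os.makedirs(os.path.dirname(dest), exist_ok=True)
        # Mode "x" fails instead of overwriting if the name appeared meanwhile.
        with open(dest, "xb") as fh:
            fh.write(data)
    manifest = os.path.join(root, deb_name + "_unpacked")
    with open(manifest, "w", encoding="utf-8") as fh:
        fh.write(f"Unpack-Suffix: {suffix}\n")
        fh.write("Unpacked-Files:\n")
        for rel in sorted(files):
            fh.write(f" {rel}\n")
    return manifest, suffix
```

A consumer of such a manifest would still need the package's control data and maintainer scripts, which this sketch does not handle.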
Re: Debdelta and Streaming Package Installation for dpkg/APT
Hi!

On Sun, 2011-04-24 at 23:44:25 +0200, A Mennucc wrote:
> here is a reply to a message you sent me long ago; this message also referenced a message in debian-dpkg [1], so I am CC-ing them as well.

Ah perfect, I had in mind talking about the apt/debdelta integration GSoC proposal [0] with you guys, but have not had the time to check several details; anyway here it is, still with some facts not checked.

[0] http://wiki.debian.org/SummerOfCode2011/AptDebdeltaIntegration

> Let me summarize. When 'debdelta-upgrade' (or 'debpatch') recreates a deb, one step is reassembling the data.tar part inside it; this part moreover is compressed (gzip, bzip2 or lately lzma). This 'reassembling and compressing' takes time (both for CPU and for HD), and is moreover quite useless, since, shortly afterwards, 'apt' will call 'dpkg -i' that decompresses and reopens the data.tar in the deb.

The data.tar does not need to be recompressed for dpkg to be able to install it. But my main problem right now is that I didn't find clear documentation of the “debdelta” file format, the closest thing that I found was the debpatch [1] example file in the debdelta package. Something that I found undesirable is that it seems to require executing a shell script included in the debdelta package to regenerate the data?

[1] /usr/share/debdelta/debpatch.sh

> It is then reasonable to collapse these two parts, and this would possibly speed up the upgrade a bit.

Yeah, AFAIUI the debdelta side seems to be similar in nature to dpkg-split (specifically the --auto command), so I think it might make more sense to integrate that part into dpkg instead of apt. At least the retrieving part would still need to be integrated into apt though.

> Here is my idea. When 'debdelta-upgrade' is called in upgrading a package 'foobar' it currently creates 'foobar_2.deb'. By an appropriate cmdline switch, instead of creating a 'foobar_2.deb', it would directly save all of its files to the filesystem, and it would add an extension to all the file names, making sure that no file name would conflict with (= overwrite) a preexisting file on the filesystem; then it would create a file 'foobar_2.deb_unpacked', that would be just a text file similar to the usual control file, but specifying also the extension used, and possibly the list of unpacked files.

Well, doesn't it have to verify the reconstructed package matches the expected one? Anyway I don't quite like the idea, it would imply offloading some of the dpkg unpacking logic out of dpkg, or just duplicating it inside dpkg itself to deal with unpacking from tar and from the file system itself and to rely safely on the file metadata from the binary package. The control files would also need to be preserved somewhere, etc.

thanks,
guillem

--
Archive: http://lists.debian.org/20110425052856.ga15...@gaara.hadrons.org
Re: Streaming Package Installation for dpkg/APT
On Wed, 2011-01-05 at 07:01 +0100, Guillem Jover wrote:
> Something which I guess would speed up the installation process could be to just make apt download the packages in self-contained batches, which can be unpacked/configured independently. This would also not really need any change in dpkg AFAICS. This way the installation process could start sooner than having to wait for the whole thing to get downloaded. It does not remove the need to store those batched packages on disk, but still.

I can't look up the URLs for this, but when I worked for Canonical we discussed something like this at one UDS, and there should be blueprints and wiki pages on the Ubuntu sites for this. Some searching should turn them up.

From memory, what we came up with was basically what Guillem hints at:

* apt will order its downloads in installation order
* whenever apt has a self-contained batch, it will feed them to dpkg
* while dpkg runs, apt will continue to download things in the background

Further, we discussed the possibility of doing some of the dpkg installation phases in parallel, even while waiting for the rest of a batch to be downloaded: for example, unpacking might be possible already at that time. This is more error prone and more complicated, though.

Related to these discussions we also discussed the possibility of speeding up downloads by using debdeltas. debdelta seems to work quite well, and it might be a good idea for Debian to adopt it officially.

--
Blog/wiki/website hosting with ikiwiki (free for free software): http://www.branchable.com/

--
Archive: http://lists.debian.org/1294267575.2953.42.ca...@havelock.lan
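The batching scheme described in this message can be sketched as follows. This is a simplified illustration, not how apt works: the function name `self_contained_batches` is hypothetical, and dependencies are reduced to plain package names, ignoring versions, alternatives, Pre-Depends, Conflicts and the other relationship fields.

```python
def self_contained_batches(download_order, depends, installed=frozenset()):
    """Group packages, in download order, into batches that can go to dpkg.

    A batch is emitted as soon as every dependency of every pending package
    is either already installed, part of an earlier batch, or pending itself.
    `depends` maps a package name to an iterable of dependency names.
    """
    done = set(installed)
    pending = []
    for pkg in download_order:
        pending.append(pkg)
        available = done | set(pending)
        if all(dep in available for p in pending for dep in depends.get(p, ())):
            yield list(pending)
            done.update(pending)
            pending = []
    if pending:
        raise ValueError(f"packages with unsatisfied dependencies: {pending}")


if __name__ == "__main__":
    order = ["libc", "libfoo", "app", "bar", "libbar"]
    deps = {"libfoo": ["libc"], "app": ["libfoo"], "bar": ["libbar"]}
    for batch in self_contained_batches(order, deps):
        print(batch)
```

In this example 'bar' arrives before its dependency 'libbar', so the two are held back and emitted together as one batch. Packages that depend on each other in a cycle end up in the same batch for the same reason.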
Re: Streaming Package Installation for dpkg/APT
Hi,

Guillem and Lars, thank you very much for your detailed replies. They are really informative and encouraging. What I learned from your replies is that the idea is quite complex, and that instead of streaming we can think of other ways of speeding up the installation process, which has taken my idea in a new direction. As you have pointed out, I will also look into the possibilities of using self-contained batches and debdeltas with apt/dpkg to speed things up.

But still, the streaming idea also looks interesting to me. I will study apt/dpkg functionality and streaming technologies in more detail and try to come up with a suggestion on it. What I feel is that we can use a tree-like structure to resolve the dependencies and use it in the streaming process in some way to achieve our goal. I am not sure if it's doable or not, but I would like to develop the idea on top of this. I will let you know if I come up with something interesting.

I really appreciate your help. Thank you.

On 1/6/11, Lars Wirzenius l...@liw.fi wrote:
> On Wed, 2011-01-05 at 07:01 +0100, Guillem Jover wrote:
> > Something which I guess would speed up the installation process could be to just make apt download the packages in self-contained batches, which can be unpacked/configured independently. This would also not really need any change in dpkg AFAICS. This way the installation process could start sooner than having to wait for the whole thing to get downloaded. It does not remove the need to store those batched packages on disk, but still.
>
> I can't look up the URLs for this, but when I worked for Canonical we discussed something like this at one UDS, and there should be blueprints and wiki pages on the Ubuntu sites for this. Some searching should turn them up.
>
> From memory, what we came up with was basically what Guillem hints at:
>
> * apt will order its downloads in installation order
> * whenever apt has a self-contained batch, it will feed them to dpkg
> * while dpkg runs, apt will continue to download things in the background
>
> Further, we discussed the possibility of doing some of the dpkg installation phases in parallel, even while waiting for the rest of a batch to be downloaded: for example, unpacking might be possible already at that time. This is more error prone and more complicated, though.
>
> Related to these discussions we also discussed the possibility of speeding up downloads by using debdeltas. debdelta seems to work quite well, and it might be a good idea for Debian to adopt it officially.
>
> --
> Blog/wiki/website hosting with ikiwiki (free for free software): http://www.branchable.com/

--
Regards,
Ishan Jayawardena.

--
Archive: http://lists.debian.org/aanlktimt5fmfp0cvh_6jsnhyerd=5vwu1dda3ouv2...@mail.gmail.com
Streaming Package Installation for dpkg/APT
Hi,

I would like to know about the streaming package installation for dpkg/APT. I read about this in last year's summer of code ideas list of Debian [1], and found it interesting. I also found that it had not been taken by any of the applicants, and, therefore, I would like to work on it this summer.

Is there any ongoing development related to that idea? There is a description given in [1]; apart from that, are there any concerns about it? I would like to know your ideas and suggestions about it, to proceed. Please let me know if you have something to share with me, I'm looking forward to your feedback.

[1] http://wiki.debian.org/SummerOfCode2010/StreamingPackageInstall

Thank you.

--
Regards,
Ishan Jayawardena.

--
Archive: http://lists.debian.org/aanlktikew9zujplwmdamxlje6690_pgmkxt9khs4w...@mail.gmail.com
Re: Streaming Package Installation for dpkg/APT
Hi!

On Wed, 2011-01-05 at 09:55:29 +0530, Ishan Jayawardena wrote:
> I would like to know about the streaming package installation for dpkg/APT. I read about this from last year's summer of code ideas list of Debian [1], and found it interesting. I also found that it had not been taken by any of the applicants, and, therefore, I would like to work on it this summer.

You might want to talk with the people involved in that proposal, as I don't think/remember it ever being discussed on this list. CCed them now. CCed Lars too, who AFAIR mentioned something like this to me at some point?

> Is there any ongoing development related to that idea?

I don't know of any.

> There is a description given in [1] and apart from that, are there any concerns of it? I would like to know your ideas and suggestions about it, to proceed. Please let me know if you have something to share with me, I'm looking forward to your feedback.

Michael and Simon might be able to fill in the blanks.

About concerns, the one that comes to mind immediately is that dpkg treats the packages as the basic units of operation: when invoked it first parses the control files for all provided packages, and then operates on them, reordering if needed, bailing out if dependencies cannot be satisfied, breaking cycles, etc. If the packages are not on disk, and they are streamed to dpkg, then it might not be able to operate properly. Which might not be an insurmountable issue, but then I've not thought this through too much...

Something which I guess would speed up the installation process could be to just make apt download the packages in self-contained batches, which can be unpacked/configured independently. This would also not really need any change in dpkg AFAICS. This way the installation process could start sooner than having to wait for the whole thing to get downloaded. It does not remove the need to store those batched packages on disk, but still.

[1] http://wiki.debian.org/SummerOfCode2010/StreamingPackageInstall

thanks,
guillem

--
Archive: http://lists.debian.org/20110105060107.ga...@gaara.hadrons.org
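The concern raised in this message, that the control data of all packages is needed before anything can be ordered, can be seen in a toy sketch. It is not dpkg's algorithm: the function name `plan_order` is hypothetical, dependencies are reduced to plain names, and unlike dpkg the sketch does not break cycles (the standard library's `TopologicalSorter` raises `CycleError` instead).

```python
from graphlib import TopologicalSorter  # Python 3.9+


def plan_order(packages, installed=frozenset()):
    """Return an order in which every package comes after its dependencies.

    `packages` maps each package name to a list of dependency names, as
    would be read from parsed control data. The whole mapping must be known
    up front, which is what makes pure streaming awkward.
    """
    known = set(packages) | set(installed)
    missing = {
        pkg: [d for d in deps if d not in known]
        for pkg, deps in packages.items()
        if any(d not in known for d in deps)
    }
    if missing:
        # Bail out before touching anything, as a real installer would.
        raise ValueError(f"unsatisfiable dependencies: {missing}")
    # Only dependencies among the provided packages constrain the order.
    graph = {pkg: [d for d in deps if d in packages] for pkg, deps in packages.items()}
    return list(TopologicalSorter(graph).static_order())
```

For example, `plan_order({"app": ["libfoo"], "libfoo": ["libc"], "libc": []})` places 'libc' before 'libfoo' and 'libfoo' before 'app', whatever order the packages were supplied in.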