Re: remastering ISOs to append boot options

2009-07-06 Thread Joseph Rawson
On Sunday 05 July 2009 19:42:19 Samuel Thibault wrote:
> Hello,
>
> Some a11y people asked how to very easily remaster ISOs so as to append
> parameters to the kernel command line, to e.g. setup the braille
> configuration once for good before burning a CD. I've prepared a small
> crude script to do that on
>
> http://people.debian.org/~sthibault/remaster-append.sh
>
> it depends on the bsdtar and genisoimage packages, sample use is
>
> remaster-append.sh "brltty=eu,ttyS-1,fr_FR"
> debian-testing-i386-businesscard.iso myimage.iso
>
> I'm wondering how that could be provided in debian, or whether it
> already exists somewhere and I just wasn't able to find it :) I believe
> it could be more generally useful, for people preseeding stuff & such.
>
> Samuel

I found the instructions on how to do this here:

http://wiki.debian.org/DebianInstaller/Modify/CD

I also wrote a script that modifies a netinst iso by inserting files pulled 
from a subversion repository.

The thing I like about your approach is using bsdtar, which I didn't even know 
existed.  The wiki article (and consequently, my script) uses mount -o loop, 
which requires root permissions.  It would be nice if the wiki article used 
bsdtar as an example instead.
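For the record, the no-root flow can be outlined like this (a hypothetical Python sketch; the genisoimage flags are assumptions for a typical isolinux image, not taken from Samuel's script):

```python
# Hypothetical sketch of a no-root remaster: extract with bsdtar instead
# of mount -o loop, edit the boot config, rebuild with genisoimage.
# Flags and paths are illustrative assumptions, not the real script.

def remaster_commands(iso, output, workdir):
    """Return the two external commands the remaster would run."""
    extract = ["bsdtar", "-C", workdir, "-xf", iso]  # no root needed
    # Between these steps one would chmod -R u+w the tree and append
    # kernel parameters in isolinux/txt.cfg (or equivalent).
    rebuild = ["genisoimage", "-r", "-J",
               "-b", "isolinux/isolinux.bin", "-c", "isolinux/boot.cat",
               "-no-emul-boot", "-boot-load-size", "4", "-boot-info-table",
               "-o", output, workdir]
    return extract, rebuild
```

The point is only that bsdtar replaces the mount/umount pair, so the whole thing runs as an ordinary user.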

-- 
Thanks:
Joseph Rawson


signature.asc
Description: This is a digitally signed message part.


Progress on reprepro frontend

2009-07-03 Thread Joseph Rawson
y the configuration is being managed 
so far, which has been the most difficult part.  There is some code that 
handles running reprepro, but it hasn't really been used yet.  Only update 
and export are handled now, but it shouldn't be too difficult to get this part 
going.  I have been more concerned with getting reprepro configured in a way 
that makes it easy to use as a backend with a simple frontend configuration.

I'm not really happy with the name "repserve", but I picked it out of the air, 
because I needed to start with something.  I would like to use another name, 
but I can't think of one that will work.  I'm open to suggestion here.  I'm 
also open to suggestion concerning anything that I've written above, although 
some suggestions should be accompanied by a patch or example.  I would really 
like to gather suggestions on how to name the repositories.  I think that my 
guessing function is a good start, but it could use a lot of improvement.  
When the guessing gets good enough, the function could raise a warning when a 
name couldn't be guessed, so the user can then edit the repserve.conf to fix 
the problem.
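The guessing works roughly along these lines; this is a simplified, illustrative sketch, not the actual repserve code (`guess_repo_name` is a made-up name):

```python
# Illustrative sketch of guessing a repository name like "debian.org"
# from an apt mirror URL -- not the actual repserve code.
from urllib.parse import urlparse

def guess_repo_name(method_url):
    """Guess a short repository name from a mirror URL."""
    host = urlparse(method_url).hostname or ""
    parts = host.split(".")
    # Strip a leading mirror prefix like 'ftp' or 'www', keep the domain.
    if len(parts) > 2 and parts[0] in ("ftp", "www", "archive"):
        parts = parts[1:]
    return ".".join(parts[-2:]) if len(parts) >= 2 else host
```

When a URL doesn't fit this pattern, the function would fall through to the raw hostname, which is exactly the case where a warning and a manual repserve.conf edit make sense.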

All the code is here:
svn://svn.berlios.de/paella/repserve/trunk

I would like to move the code to its own project space, but I need to name it 
something before that happens.

I have been playing around with germinate a bit, in case we want to make a 
short list of manually selected packages, and use germinate to resolve the 
dependencies and create the filterlists.  I don't expect this part to be 
working properly anytime soon.

-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-25 Thread Joseph Rawson
On Sunday 21 June 2009 03:33:33 Goswin von Brederlow wrote:

> > The Release could be signed using an rsign method with the machine(s)
> > that manage the repository, or it could be done locally on the server
> > using gpg-agent, or an unencrypted private key, depending on how the
> > administrator prefers to manage it.
>
> The simplest implementation would be a tiny proxy applet that, when a
> deb file is requested, checks if the file is in the local
> archive. If it is then send it. If not then request file from
> upstream and pipe it to apt (no latency) and a tempfile. When the
> download has finished then reprepro --include suite deb. Doing the
> same for source is a little more tricky as you needs the dsc and
> related files as a group.
>
I don't understand the tempfile part.  Otherwise, that's a better idea, since 
my idea depended on running reprepro update, then sending the appropriate 
debs.
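If the tempfile is only there to tee the stream (write each chunk to apt and to a local copy at the same time), the core of such a proxy might look like this hypothetical sketch (names made up):

```python
# Hypothetical sketch of the "pipe to apt and a tempfile" idea: stream
# each chunk from upstream to the client with no added latency, while
# also writing it to a tempfile for a later reprepro include.
import io, tempfile

def tee_download(upstream, client, chunk_size=64 * 1024):
    """Copy upstream to client, returning the path of the local copy."""
    tmp = tempfile.NamedTemporaryFile(suffix=".deb", delete=False)
    while True:
        chunk = upstream.read(chunk_size)
        if not chunk:
            break
        client.write(chunk)   # apt sees the data immediately
        tmp.write(chunk)      # local copy for reprepro to include later
    tmp.close()
    return tmp.name
```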
> >> Optional the apt proxy could prefetch package versions but for me that
> >> wouldn't be a high priority.
> >>
> >> Nice would be that it fetches sources along with binaries. When I find
> >> a bug in some software while traveling I would hate to not have the
> >> source available to fix it. But then it also needs to fetch
> >> Build-depends and their depends. So that would complicate matters a
> >> lot.
> >
> > I mentioned that part above.
> >
> >> MfG
> >> Goswin
> >
> > Overall, I think that reprepro does a good job of maintaining a local
> > repository, and we shouldn't reimplement what it does.  Reprepro also
> > seems flexible enough to implement most of the backend with simple
> > commands and options.  I've never tried to implement a new apt-method
> > before, so I think that would take a bit more research from me.
>
> I totally agree that reprepro as the cache/storage backend would be
> great use of existing software.
>
This is where I'm starting the code.  Since we agree that, regardless of how 
the partial mirror(s) will be managed, reprepro is the best choice for the 
backend, I decided to start making a "frontend", or more 
appropriately a "middle-layer", for it.  Making this part simple enough to use 
with the most likely configuration, while keeping it almost as flexible as 
reprepro, has taken quite a bit of work and thought.

I have been working from the assumption that the local repository won't be a 
merged repository, but will be a set of partial mirrors.  By this I mean 
that "debian.org" doesn't have to be merged with "backports.org", 
but "sid/debian.org" may be in the same repository as "lenny/debian.org" 
(although even this could be separate, even if not recommended).  What I'm 
saying is that I'm trying to allow either separate or merged repositories to 
be used where they make the most sense.

> The problem I have with it being an apt method is that the apt method
> runs on a different host than the reprepro. That would require ssh
> logins from all participating clients or something to alter the
> reprepro filter.

I didn't stop to think about authentication, but I agree that it adds another 
level of work.  I took a bit of time to try and read up on how apt transport 
methods work, but I didn't get very far.  The only two transport methods that 
are available now are https and debtorrent.  Both of those are written in C, 
which I'm not very good at using.

I think that I'm just going to work on the basics of controlling reprepro, and 
adding/merging/removing filterlists, and when I'm satisfied that's working 
properly it'll be easier to decide how to control/manage it.  I think that it 
will be better to work in that direction first, since it will be needed 
anyway.

I have a small amount of code that I've started on.  It doesn't do anything 
yet except create the distribution and updates files in the conf/ 
directory(ies).  I also have a bit of code to help merge filterlists, but I 
don't have any code that actually creates the lists and uses them in the 
reprepro config.  Once I figure out where to upload the code, I'll let you 
know.
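For what it's worth, merging the per-host filter lists is essentially a union over package names: a package stays as long as any host still wants it.  A rough sketch of the idea (illustrative, not the actual code):

```python
# Rough sketch of merging per-host reprepro filter lists: keep a package
# as long as *any* host still lists it. Illustrative only.

def merge_filter_lists(*lists):
    """Merge lists of 'package<TAB><TAB>install' lines into one master list."""
    wanted = set()
    for text in lists:
        for line in text.splitlines():
            if line.strip():
                wanted.add(line.split()[0])  # first field is the package
    return "".join(f"{pkg}\t\tinstall\n" for pkg in sorted(wanted))
```

This also shows why a package only drops out of the pool once it is absent from every host's list.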

-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-20 Thread Joseph Rawson
On Saturday 20 June 2009 03:16:33 Goswin von Brederlow wrote:
> Joseph Rawson  writes:
> > On Friday 19 June 2009 12:57:25 Goswin von Brederlow wrote:
> >> Or have a proxy that adds packages that are requested.
> >
> > When I woke up this morning, I was thinking that it might be interesting
> > to have an apt method that talks directly to reprepro.  It's just a vague
> > idea now, but I'll give it some more thought later.
>
> Way too much latency to mirror a deb when requested and you need to
> run apt-get update for it to show up.
>
> The best you can do is add the package to the filter list and then
> fetch it directly. Then the next night the mirror will pick it up for
> future updates.
>
What I had in mind would eliminate a large part of the latency, and also keep 
from downloading the deb twice.

Use a server application (I'll call it repserve for now) on the machine that 
hosts the reprepro repository.  

apt-get update
The apt method talks to repserve, then repserve tells reprepro to run either 
update or checkupdate, then repserve feeds the appropriate files from the 
reprepro lists/ director(y/ies) back to the apt-get process on the local 
machine.  This would probably use a bit more bandwidth (at least for the 
first update), since apt-get downloads .pdiff files, whereas reprepro just 
grabs the whole Packages.gz files.

apt-get install, upgrade, build-dep
The apt method determines which source in its apt lists to retrieve the 
package from, then sends that info to repserve.  Repserve looks in its 
repositor(y/ies) to determine where those packages are (or if they aren't yet 
mirrored), probably by scanning the filter lists.  Repserve then tells 
reprepro to update in the appropriate repositories (if necessary).  Then 
repserve signals the local client (or local client polls repserve), and the 
debs are then transferred from reprepro repos to local client.  After that, 
the repserve process could instruct reprepro to retrieve the sources, if it's 
configured to do that.  Also, it could try and determine build deps for those 
packages, and retrieve them and the sources, if it's configured to do that as 
well.  With retrieving builddeps enabled, there might be a problem in having 
to explicitly list preferred alternatives, but this mainly affects packages 
that have drop-in replacements for libfoo-dev, the way libgamin-dev provides 
libfam-dev.

This is still just a rough idea.  One of the interesting things about using an 
idea like this, is that it can still allow reprepro to be used in the normal 
way, so you can have a couple of machines that instruct repserve to help 
maintain the repository, and other machines on the network can just use 
reprepro directly through apache, ftp, etc.  The "controlling" machines would 
have a sources.list like:

deb repserve://myhost/debrepos/debian lenny main contrib non-free

The repserve method on the client would send that line to the repserve server.  
The server would parse the line and match it to the appropriate repository 
from its configuration.
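Parsing such a line is straightforward; a hypothetical sketch of what the server side might do with it (field names are made up):

```python
# Hypothetical sketch of parsing a 'deb repserve://...' sources.list
# line on the server side. Field names are illustrative.

def parse_repserve_line(line):
    """Split a repserve sources.list line into its components."""
    parts = line.split()
    if parts[0] != "deb" or not parts[1].startswith("repserve://"):
        raise ValueError("not a repserve source line: %r" % line)
    uri = parts[1][len("repserve://"):]          # e.g. myhost/debrepos/debian
    host, _, repo_path = uri.partition("/")
    return {"host": host, "repo": repo_path,
            "suite": parts[2], "components": parts[3:]}
```

The server would then look up `repo` and `suite` in its own configuration to find the matching reprepro distribution.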

The other hosts would just have this in sources.list:

deb http://myhost/debrepos/debian lenny main contrib non-free

The hosts using repserve could be the only ones with filter lists in reprepro, 
but it may be desired to have filter lists from the other machines, also.  
This would help keep packages from disappearing from the pool when they are 
still needed.  It may also be nice to use reprepro's snapshotting each time a 
repserve method updates a repository, although this may require using those 
snapshot urls on the hosts that aren't using repserve.


>
> But now you made me think about this too. So here is what I think:
>
> - My bandwidth at home is fast enough to fetch packages directly. No
>   need to mirror at all.
>
> - I don't want to download a package multiple times (once per host) so
>   some shared proxy would be good.
>
My idea would keep that from happening, at the expense of latency.  The 
latency would be minimal, as it would just be dependent on reprepro 
retrieving the package(s) and signalling the client that the package is 
ready.  Using reprepro to add extra packages to the repository from upstream 
without doing a full update may not be possible, but if it were, both the 
latency and the bandwidth to the internet would be minimal.  I just looked at 
the manpage again, and this may be possible by using the --nolistsdownload 
option with the update/checkupdate command.


> - Bootstraping a chroot still benefits from local packages but a
>   shared proxy would do there too.
>
> - When I'm not at home I might not have network access or only a slow
>   one so then I need a mirror. And my parents computer has a Linux that
>   only I use and that needs a major update every time I v

Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Friday 19 June 2009 20:54:28 Tzafrir Cohen wrote:
> On Fri, Jun 19, 2009 at 06:23:08AM -0500, Joseph Rawson wrote:
> > On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote:
> > > On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote:
> > > > would be much more interested in making a tool that would make it
> > > > easier to manage local/partial debian mirrors (i.e. one that helped
> > > > resolve the dependencies), rather than have an apt-get wrapper.  I
> > > > also think that once such a tool is made, it would make it easier to
> > > > build an apt-get wrapper that works with it.  I don't think that
> > > > viewing the problem with an "apt-get wrapper" solution is the best
> > > > way to approach it, but I do think that it would be valuable once the
> > > > underlying problems are solved.
> > >
> > > And reprepro does not fit the bill because?
> >
> > It fits part of the bill, as it's an excellent tool for maintaining a
> > repository, but it doesn't resolve dependencies (nor should it).
>
> Just in case it might help, here's a script we used internally (at the
> Sarge time) to maintain a dummy repository that would help us eventually
> resolve an original list of packages to a complete list of packages we
> ask a reprepro source to update.
>
Did you forget to attach it? :)

> --
> Tzafrir Cohen | tzaf...@jabber.org | VIM is
> http://tzafrir.org.il || a Mutt's
> tzaf...@cohens.org.il ||  best
> ICQ# 16849754 || friend



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Friday 19 June 2009 12:57:25 Goswin von Brederlow wrote:
> Joseph Rawson  writes:
> > On Friday 19 June 2009 00:27:06 Goswin von Brederlow wrote:
> >> Joseph Rawson  writes:
> >> If so then you can configure a post invoke hook in apt that will copy
> >> the dpkg status file of the host to the server [as status.$(hostname)]
> >> and then use those on the server to generate the filter for
> >> reprepro. I think I still have a script for that somewhere but it is
> >> easy enough to rewrite.
> >
> > That's good for binaries, but I don't know about the source.  It wasn't
> > long ago that I noticed a problem with reprepro not obtaining the
> > corresponding source packages when you use a filter list taken
> > from  "dpkg --get-selections".  I remember that the source for jigdo
> > wasn't in my partial mirror, because there were no binaries named
> > "jigdo", rather "jigdo-file" and "jigdo-lite".  Since there were no
> > sources with that name, the jigdo source was never mirrored on my partial
> > mirror.  I don't know if that behavior has been fixed now, since there is
> > now a binary named jigdo, instead of jigdo-lite.
>
> My filter first converted the packages listed in the status file(s) to
> source package names (packages with different name have a "Source:"
> entry) and then output those for sources.
>
> > Also, it's more difficult for the local repository to determine the
> > difference between the automatically selected and manually selected
> > packages in this type of setup, since you would be sending a longer list
> > of "manually selected packages", instead of distinguishing which ones are
> > actually selected.  I guess that it doesn't matter much, as a package
> > would only be removed from the repository once it's not listed on any of
> > the lists.  There were times when I didn't want certain packages to be
> > removed from the repository, regardless of whether they were installed or
> > not, so I used to run xxdiff on the packages files, so the newer ones
> > were added.
>
> Same problem here. Esspecially build-depends. There where a lot of
> packages I only needed inside my build chroots and only for the time
> of the build. So they never showed up on the mirror. Then I just
> resized the mirror partition and mirrored all debs.
>
That was my ultimate solution to the problem.  I bought one of the new 
terabyte usb external drives and just mirrored the whole repository.  I had 
been satisfied to just call the problem solved at that point, but this thread 
rekindled my interest in finding a better solution.  Before I bought the 
hard drive, I was seriously looking into getting germinate and reprepro 
working together, but once I bought the drive, I just set it all aside.  
Still, this external drive isn't portable, and my small portable drive is 
only 80G (which is more than enough for a partial mirror of source, i386, and 
amd64), so I do still need to solve the problem.  Besides, a month after I 
bought the drive, I discovered that I have a monthly cap on my transfers so 
it would be better, all around, to stop mirroring the complete repository.

> > In my way of thinking, I'm not looking to merge upstream repositories
> > together in one repository.  Besides, there are already tools, such as
> > apt-move that would be better for this job.  Long ago, apt-move was the
> > primary tool that I used to keep a local repository, and it worked pretty
> > well, as long as all the machines that were using it were on the same
> > release.
> >
> > I have found that reprepro is the absolute best tool for maintaining a
> > debian mirror.  The only problem I have with it is when I want to
> > maintain a partial mirror, and I don't want a merged repository, is that
> > I have to spread the packages lists to different places, and when you
> > start adding machines, you start adding more lists to the configuration,
> > when it would probably be better to maintain a set of "master" lists that
> > are generated from the many lists that come from the machines.
>
> Or have a proxy that adds packages that are requested.
When I woke up this morning, I was thinking that it might be interesting to 
have an apt method that talks directly to reprepro.  It's just a vague idea 
now, but I'll give it some more thought later.

>
> MfG
> Goswin



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
urrently mirrored packages allow multiple subsets
> which clients using this repository might have installed)...
>
I used to have to keep outdated libraries in my filter list when I was using a 
partial sid mirror, as some packages would become uninstallable without them.  
I've learned over the course of years that you can't run from a snapshot of 
sid, but rather have to use it for a few months to get the dependencies to 
work out, even though many of those dependencies have changed versions in the 
official repository.
But really, that last paragraph is me trying to understand what you were 
saying.  You went a bit above my head, and I'm having trouble following you.

> Hochachtungsvoll,
>   Bernhard R. Link



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Friday 19 June 2009 00:27:06 Goswin von Brederlow wrote:
> Joseph Rawson  writes:
> > BTW, the subject of this thread is "apt-get wrapper for maintaining
> > Partial Mirrors".  The solution I'm proposing is "a simple tool for
> > maintaining Partial Mirrors" (which could possibly be wrapped by apt-get
> > later).
> >
> > I think that just pursuing an "apt-get wrapper" leads to some
> > complications that could be avoided by creating the "partial mirror tool"
> > first, then looking at wrapping it later.  One complication might be "how
> > do handle apt-get remove", and another might be "how to handle sid
> > libraries that disappear from official repository, yet local machines
> > must have them".
>
> Ahh, so maybe I completly misread that part.
>
> Do you mean a wrapper around apt-get so that "apt-get install foo" on
> any client would automatically add "foo" to the list of packages being
> mirrored on the server?
>
> If so then you can configure a post invoke hook in apt that will copy
> the dpkg status file of the host to the server [as status.$(hostname)]
> and then use those on the server to generate the filter for
> reprepro. I think I still have a script for that somewhere but it is
> easy enough to rewrite.
>
When you mentioned the word "hook", I was reminded of reprepro's ability to 
use hooks.  I started testing using a ListHook script with reprepro.  I'm 
attaching the script so you can see the general idea.  The script doesn't do 
anything effective, but may be helpful in understanding more of the way I'm 
approaching the idea.  Please don't laugh too hard, I'm just playing with 
ideas now.

Among other possible reasons, there are two main reasons why this particular 
approach won't work.  One reason is that the ListHook calls a script for each 
list independently.  So, if you have a package in contrib that depends on a 
package in main, like many do, the dependency won't be resolved using this 
method.  Also, the germinator object only handles one arch at a time, so if 
you are mirroring multiple arches, you need to use a germinator object for 
each one.  One way that this problem can be countered is by running a simple 
server that holds the germinator object, and the script that ListHook 
executes would communicate with that server.  Then the server would "grow" 
the seeds and create the filter lists that would be used by reprepro.
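For context, the hook hangs off reprepro's update rules, roughly like this in conf/updates (values illustrative; check the reprepro manpage for the exact ListHook semantics):

```text
Name: debian
Method: http://ftp.debian.org/debian
Suite: lenny
Components: main contrib
Architectures: i386 source
ListHook: /usr/local/bin/testgerm
```

reprepro invokes the hook once per downloaded list file, which is where the per-list limitation comes from.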

I tried this approach because I didn't see the sense in downloading the 
packages lists more than necessary.  The way I was thinking before was to 
seed germinate (which would download the package lists), parse the output, 
create filter lists from that output, send them to reprepro, and call 
reprepro to update.  This forces all of those package lists to be downloaded 
twice, which was something I tried to avoid with this short experiment.

It also seems to be somewhat difficult to "plant the seeds" into germinate 
manually.  I'm sure that problem could be solved by looking through the code 
a bit longer.

> MfG
> Goswin



-- 
Thanks:
Joseph Rawson


testgerm
Description: application/python




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Friday 19 June 2009 05:09:31 Tzafrir Cohen wrote:
> On Fri, Jun 19, 2009 at 01:52:43AM -0500, Joseph Rawson wrote:
> > would be much more interested in making a tool that would make it easier
> > to manage local/partial debian mirrors (i.e. one that helped resolve the
> > dependencies), rather than have an apt-get wrapper.  I also think that
> > once such a tool is made, it would make it easier to build an apt-get
> > wrapper that works with it.  I don't think that viewing the problem with
> > an "apt-get wrapper" solution is the best way to approach it, but I do
> > think that it would be valuable once the underlying problems are solved.
>
> And reprepro does not fit the bill because?
>
It fits part of the bill, as it's an excellent tool for maintaining a 
repository, but it doesn't resolve dependencies (nor should it).

> --
> Tzafrir Cohen | tzaf...@jabber.org | VIM is
> http://tzafrir.org.il || a Mutt's
> tzaf...@cohens.org.il |        |  best
> ICQ# 16849754 || friend



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Thursday 18 June 2009 04:47:45 Goswin von Brederlow wrote:
> Frank Lin PIAT  writes:
> > On Tue, 2009-06-09 at 16:16 -0500, Joseph Rawson wrote:
> >> On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote:
> >> > This can be stated as: if a person
> >> > wants to keep a customised set of packages for usage with the
> >> > distribution, the tool should be able to develop dependencies, fetch
> >> > packages, generate appropriate documentation and then create the
> >> > corresponding directory structure in the target mirror! The task can
> >> > be extended to include packages which are currently not under one of
> >> > the standard mirrors!
> >
> > 
> > One don't have to merge the repositories, one can just declare multiple
> > sources in /etc/apt/*
> > 
>
> Lets say I want to mirror xserver-xorg from experimental. Then I would
> want it to include xserver-xorg-core (>= xyz) also from experimental
> as the dependency dictates but not include libc6 from experimental as
> the sid one is sufficient.
>
> A key point here would be flexibility.
This is something that I hadn't considered yet.  At first I thought it would 
be a problem for the "post invoke hook" using the dpkg status file that you 
mentioned earlier, but actually it wouldn't be much of a problem; I was 
confused.  I was thinking of "--get-selections", which just returns the name 
of the package and "install/deinstall", but the status file also contains the 
version being used, and this could be matched to the appropriate repository 
in the sources list (so you get the libc from main instead of experimental, 
since the status file uses the version that's in main).

However, I don't know how to use that info with reprepro.  With reprepro, I've 
only sent "--get-selections" lists to it.  In fact, this is how I used to 
install new packages in sid, and make sure they came from the local 
repository first.


#!/bin/bash
# Build a reprepro filter list of packages that are selected for
# install but not yet installed.
packages=`grep-status " install ok not-installed" | grep ^Package | gawk '{print $2}'`
#packages=`aptitude search ~N | grep ^.i | gawk '{print $2}'`
touch conf/list-uninstalled.tmp
for package in $packages 
  do echo -e "$package\t\tinstall" >> conf/list-uninstalled.tmp
done
# sort before uniq: uniq only removes adjacent duplicates
sort conf/list-uninstalled.tmp | uniq > conf/list-uninstalled
rm conf/list-uninstalled.tmp


You may be able to tell by looking at the script that I'm still in the process 
of getting used to aptitude, being a longtime dselect user. ;)
Anyway, I don't know much about determining (with reprepro) which upstream 
repository holds the version of the package that I want installed.

>
> >> > I think the tool can have immense utility in helping people automate
> >> > the task of mantaining the repositories. Suggestions, positive and
> >> > negative are invited.
> >> >
> >> > I have not included the impl details as I would first like to evaluate
> >> > the idea at a feasibility and utility level.
> >
> > If the scope of your project includes being able to bootstrap systems
> > from the mirror, resolving dependency is much more complex (some
> > packages aren't resolved by dependencies. For instance, the right kernel
> > is select by some logic in Debian-installer).
> > I found some interesting logic in debian-cd package.
>
> You would include "linux-image-" in your package list. That
> isn't really a problem of the tool. Just of the input you need to provide.
> Also you would include everything udeb and everything
> essential/required for bootstraping purposes.
>
I was also thinking along those lines, too.  Same with fam/gamin and other 
packages that have "drop-in" replacements.

> Again flexibility is the key.
>
> > Still, I don't consider that allowing bootstrapping is mandatory. Your
> > project would still be extremely valuable without it. [for those 95% of
> > the people that install from CD, as opposed to netboot].
> >
> > Regards,
> >
> > Franklin
>
> MfG
> Goswin
>
> PS: the essential/required packages can already easily be filtered
> with grep-dctrl.



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Thursday 18 June 2009 03:17:13 Frank Lin PIAT wrote:
> On Tue, 2009-06-09 at 16:16 -0500, Joseph Rawson wrote:
> > On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote:
> > > I had an idea in mind whereby the task of making mirrors for personal
> > > distributions can be automated.
>
> 
> Depending on what you want to achieve, a caching proxy might be an easy
> solution (there are a specialized in the archive already)
> 
>
Or possibly apt-move called as a post-invoke action of apt-get.

> > > This can be stated as: if a person
> > > wants to keep a customised set of packages for usage with the
> > > distribution, the tool should be able to develop dependencies, fetch
> > > packages, generate appropriate documentation and then create the
> > > corresponding directory structure in the target mirror! The task can
> > > be extended to include packages which are currently not under one of
> > > the standard mirrors!
>
> 
> One don't have to merge the repositories, one can just declare multiple
> sources in /etc/apt/*
> 
>
Then it becomes harder to send the package to the appropriate local 
repository, since they aren't merged.  I would also prefer to not have to 
deal with a merged repository, but keep separate upstream partial mirrors, as 
they would probably be easier to manage.

> > > I think the tool can have immense utility in helping people automate
> > > the task of mantaining the repositories. Suggestions, positive and
> > > negative are invited.
> > >
> > > I have not included the impl details as I would first like to evaluate
> > > the idea at a feasibility and utility level.
>
> If the scope of your project includes being able to bootstrap systems
> from the mirror, resolving dependency is much more complex (some
> packages aren't resolved by dependencies. For instance, the right kernel
> is select by some logic in Debian-installer).
> I found some interesting logic in debian-cd package.
>
> Still, I don't consider that allowing bootstrapping is mandatory. Your
> project would still be extremely valuable without it. [for those 95% of
> the people that install from CD, as opposed to netboot].
>
The reason that I recommended tying germinate and reprepro together with a 
tool was because the original post was discussing "personal distributions".  
To me, this implies the ability to bootstrap, and also the need to have 
a "self building" source/binary repository.

I have just made some other responses to Goswin that should help explain my 
view on things a bit better.

> Regards,
>
> Franklin



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-19 Thread Joseph Rawson
On Friday 19 June 2009 00:27:06 Goswin von Brederlow wrote:
> Joseph Rawson  writes:
> > BTW, the subject of this thread is "apt-get wrapper for maintaining
> > Partial Mirrors".  The solution I'm proposing is "a simple tool for
> > maintaining Partial Mirrors" (which could possibly be wrapped by apt-get
> > later).
> >
> > I think that just pursuing an "apt-get wrapper" leads to some
> > complications that could be avoided by creating the "partial mirror tool"
> > first, then looking at wrapping it later.  One complication might be "how
> > do handle apt-get remove", and another might be "how to handle sid
> > libraries that disappear from official repository, yet local machines
> > must have them".
>
> Ahh, so maybe I completly misread that part.
>
It was my fault for not making this point clear from the start.  FWIW, I 
would be much more interested in making a tool that would make it easier to 
manage local/partial debian mirrors (i.e. one that helped resolve the 
dependencies), rather than have an apt-get wrapper.  I also think that once 
such a tool is made, it would make it easier to build an apt-get wrapper that 
works with it.  I don't think that viewing the problem with an "apt-get 
wrapper" solution is the best way to approach it, but I do think that it 
would be valuable once the underlying problems are solved.

> Do you mean a wrapper around apt-get so that "apt-get install foo" on
> any client would automatically add "foo" to the list of packages being
> mirrored on the server?
>
It was the original poster who mentioned the apt-get wrapper, but I took it to 
mean exactly what you said above.  The tool I was envisioning would take a 
short list of packages (a text file with package names separated by newlines, 
or a collection of such text files) combined with a list of apt sources and 
generate the partial mirror from just that information.  Some things would 
still need to be explicitly included in those lists; gamin, fam, or both, 
for example.
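
A tool like that could start from something as small as the following
sketch, which merges a directory of newline-separated seed lists into one
sorted list of package names.  The file layout and comment convention here
are assumptions of mine, not anything reprepro or germinate prescribe:

```python
import os

def read_seed_lists(directory):
    """Merge newline-separated package lists from a directory into one
    sorted, duplicate-free list.  Blank lines and '#' comments are
    skipped (an assumed convention, not a germinate/reprepro format)."""
    packages = set()
    for name in sorted(os.listdir(directory)):
        with open(os.path.join(directory, name)) as seed:
            for line in seed:
                line = line.strip()
                if line and not line.startswith('#'):
                    packages.add(line)
    return sorted(packages)
```

The resulting list is what would then be handed to germinate for
dependency expansion.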

> If so then you can configure a post invoke hook in apt that will copy
> the dpkg status file of the host to the server [as status.$(hostname)]
> and then use those on the server to generate the filter for
> reprepro. I think I still have a script for that somewhere but it is
> easy enough to rewrite.
That's good for binaries, but I don't know about the source.  It wasn't long 
ago that I noticed a problem with reprepro not obtaining the corresponding 
source packages when you use a filter list taken 
from  "dpkg --get-selections".  I remember that the source for jigdo wasn't 
in my partial mirror, because there were no binaries named "jigdo", 
rather "jigdo-file" and "jigdo-lite".  Since there were no sources with that 
name, the jigdo source was never mirrored on my partial mirror.  I don't know 
if that behavior has been fixed now, since there is now a binary named jigdo, 
instead of jigdo-lite.
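
For concreteness, the post-invoke hook Goswin describes could be sketched as
an apt configuration fragment like the one below; the server name and
destination path are hypothetical, and this is an untested sketch, not a
recommended setup:

```
// /etc/apt/apt.conf.d/90copy-status (on each client)
// After every dpkg run, push the status file to the mirror host as
// status.<hostname>; '|| true' keeps a copy failure from breaking apt.
DPkg::Post-Invoke { "scp /var/lib/dpkg/status mirror:/srv/partial/status.$(hostname) || true"; };
```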

Also, in this type of setup it's harder for the local repository to tell 
automatically selected packages from manually selected ones, since you would 
be sending one long list of "manually selected packages" instead of 
distinguishing which ones were actually chosen by hand.  I guess it doesn't 
matter much, as a package would only be removed from the repository once it's 
no longer listed on any of the lists.  There were times when I didn't want 
certain packages removed from the repository, regardless of whether they were 
still installed anywhere, so I used to run xxdiff on the package lists and 
merge in only the additions.

In my way of thinking, I'm not looking to merge upstream repositories together 
in one repository.  Besides, there are already tools, such as apt-move that 
would be better for this job.  Long ago, apt-move was the primary tool that I 
used to keep a local repository, and it worked pretty well, as long as all 
the machines that were using it were on the same release.

I have found that reprepro is the absolute best tool for maintaining a Debian 
mirror.  The only problem I have with it, when I want a partial mirror rather 
than a merged repository, is that I have to spread the package lists across 
different places.  As you add machines, you add more lists to the 
configuration, when it would probably be better to maintain a set of "master" 
lists generated from the many per-machine lists.
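
Generating such a master list could be sketched like this.  Each input is
assumed to be the text of a "dpkg --get-selections" dump (package, a tab,
then a selection state), and the output uses reprepro's FilterList line
format ("package install"); how the per-machine files get collected is left
open:

```python
def merge_selections(selection_texts):
    """Merge several 'dpkg --get-selections' dumps into one reprepro
    FilterList body: a package selected 'install' on any machine is
    kept; 'deinstall'/'purge'/'hold' entries are ignored."""
    wanted = set()
    for text in selection_texts:
        for line in text.splitlines():
            parts = line.split()
            if len(parts) == 2 and parts[1] == 'install':
                wanted.add(parts[0])
    return '\n'.join('%s install' % pkg for pkg in sorted(wanted)) + '\n'
```

A package disappears from the master list only once no machine's dump still
selects it, which matches the removal behaviour described above.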

>
> MfG
> Goswin



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-18 Thread Joseph Rawson
On Thursday 18 June 2009 02:46:42 Goswin von Brederlow wrote:
> Joseph Rawson  writes:
> > There is another application that will help with the dependencies.  It's
> > called germinate, and it will take a short list of packages and a list of
> > repositories and build a bunch of different lists of packages and their
> > dependencies.  Germinate will also determine build dependencies for those
> > packages and recursively build a list of builddeps and the builddeps'
> > builddeps.
> >
> > I have thought of making an application that would get germinate and
> > reprepro to work together to help build a decent partial mirror that had
> > the correct set of packages, but the process was a bit time consuming. 
> > It's been a while
>
> Was it that bad? It only needs to run 4 times a day when the mirror
> push comes in.
>
It wasn't the running that was time consuming, but the writing of all the code 
to seed germinate and then try to use the results with reprepro.  I'm sorry if 
I wasn't clear on which part was consuming the time.

> > since I've worked on this, since my temporary solution to the problem was
> > to buy a larger hard drive.  Currently, I have a full mirror that I keep
> > updated, and a repository of locally built packages next to it.  I'm not
> > really happy with this solution, as it uses too much disk space and I'm
> > downloading packages that will never be used, but it's given me time to
> > tackle more important problems.
> >
> > Before writing any code, I would recommend taking a look at both reprepro
> > and germinate, as each of these applications is good at solving half of
> > the problems you describe.  I think that an ideal solution would be to
> > write a frontend program that takes a list of packages and upstream
> > repositories, feeds them into germinate, obtains the result from
> > germinate, parse those results and build a reprepro configuration from
> > that, then get reprepro to fetch the appropriate packages.
>
> Combining germinate and reprepro is the right thing to do. Or reprepro
> and a new filter instead of germinate. But don't rewrite reprepro.

I never intended to rewrite reprepro.  It does its job very well.  It's not 
reprepro's job to resolve dependencies, nor should it be, as a dependency 
could lie in an entirely different repository.

I do think that since each program has its specific area of responsibility, 
a program that glues them together would be appropriate, and would help avoid 
reinventing wheels when it's not necessary.

>
> Given a little bit of care when writing the reprepro config this can
> be completly done as part of the filtering. There is no need for a
> seperate run that scanns all upstream repositories as long as you can
> define a partial order between them, i.e. contrib needs things from
> main but main never from contrib. That would also have the benefit
> that you only need to process those packages files that have changed.
>
> > I would be happy to help with this, as I could use such an application,
> > and I already have a meager bit of python code that parses the output of
> > germinate (germinate uses a wiki-type markup in its output files).  I
> > stopped working on the code since I bought a new hard drive, since I just
> > used the extra space to solve the problem for me, but I can bring it back
> > to life, as I would desire to use a more correct solution.
>
> Urgs, that sucks. It should take a Packages/Sources style input and
> output the same format.
>
I don't like the output either, but I haven't taken much time to dig into the 
germinate code very much.
> Maybe rewriting it using libapt would be better than wrapping germinate.
Germinate already uses libapt.  It imports apt_pkg from the python-apt 
package, which is a Python binding to libapt, AFAIK.  It might be easier to 
add '/usr/lib/germinate' to sys.path and control the Germinator object 
directly, bypassing the way the package lists are output from germinate.
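
That said, the wiki-style tables germinate writes take very little code to
parse.  The sketch below assumes rows look like "| package | ... |" with a
header row and a dashed separator row to skip; the exact column layout is an
assumption and should be checked against real germinate output:

```python
def parse_germinate_table(text):
    """Extract the first column (package names) from a germinate-style
    wiki table, skipping the header row and dashed separator rows."""
    packages = []
    for line in text.splitlines():
        if not line.startswith('|'):
            continue
        cells = [cell.strip() for cell in line.strip('|').split('|')]
        first = cells[0]
        # Skip the header row and rows made only of dashes.
        if not first or first == 'Package' or set(first) <= {'-'}:
            continue
        packages.append(first)
    return packages
```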

Germinate does have an advantage in that it can recursively add the builddeps 
for a package list, making a list for a partial, self-building mirror.

BTW, the subject of this thread is "apt-get wrapper for maintaining Partial 
Mirrors".  The solution I'm proposing is "a simple tool for maintaining 
Partial Mirrors" (which could possibly be wrapped by apt-get later).  

I think that just pursuing an "apt-get wrapper" leads to some complications 
that could be avoided by creating the "partial mirror tool" first, then 
looking at wrapping it later.  One complication might be "how to handle 
apt-get remove", and another might be "how to handle sid libraries that 
disappear from the official repository, yet local machines must have them".

>
> MfG
> Goswin



-- 
Thanks:
Joseph Rawson




Re: apt-get wrapper for maintaining Partial Mirrors

2009-06-09 Thread Joseph Rawson
On Tuesday 09 June 2009 13:14:53 sanket agarwal wrote:
> Hi all,
>
> We all know that there are various distros that build around Debian.
> I had an idea in mind whereby the task of making mirrors for personal
> distributions can be automated. This can be stated as: if a person
> wants to keep a customised set of packages for usage with the
> distribution, the tool should be able to develop dependencies, fetch
> packages, generate appropriate documentation and then create the
> corresponding directory structure in the target mirror! The task can
> be extended to include packages which are currently not under one of
> the standard mirrors!
>
> I think the tool can have immense utility in helping people automate
> the task of maintaining the repositories. Suggestions, positive and
> negative are invited.
>
> I have not included the impl details as I would first like to evaluate
> the idea at a feasibility and utility level.

I have been working on this idea myself for quite a while, but I haven't 
messed with the problem recently.  I was using reprepro to maintain partial 
mirrors, but it required using the output from "dpkg --get-selections" from 
almost every machine that I needed to mirror packages for.  The reprepro 
program is excellent for making partial mirrors, but it has a drawback in 
that it doesn't help resolve dependencies.  This means you can't just make a 
short list of packages and easily build a partial mirror containing those 
packages and their dependencies; rather, you have to install a machine with 
those packages and use that machine's package list with reprepro to get a 
decent mirror.

There is another application that will help with the dependencies.  It's 
called germinate, and it will take a short list of packages and a list of 
repositories and build a bunch of different lists of packages and their 
dependencies.  Germinate will also determine build dependencies for those 
packages and recursively build a list of builddeps and the builddeps' 
builddeps.
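
For reference, germinate's seed lists use a simple wiki-style list markup; a
minimal seed might look roughly like the fragment below.  The package choices
are only examples, and the exact markup should be checked against germinate's
own documentation:

```
 * openssh-server
 * brltty
 * rsync
```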

I have thought of making an application that would get germinate and reprepro 
to work together to help build a decent partial mirror that had the correct 
set of packages, but the process was a bit time consuming.  It's been a while 
since I've worked on this, since my temporary solution to the problem was to 
buy a larger hard drive.  Currently, I have a full mirror that I keep 
updated, and a repository of locally built packages next to it.  I'm not 
really happy with this solution, as it uses too much disk space and I'm 
downloading packages that will never be used, but it's given me time to 
tackle more important problems.

Before writing any code, I would recommend taking a look at both reprepro and 
germinate, as each of these applications is good at solving half of the 
problems you describe.  I think that an ideal solution would be to write a 
frontend program that takes a list of packages and upstream repositories, 
feeds them into germinate, obtains the result from germinate, parse those 
results and build a reprepro configuration from that, then get reprepro to 
fetch the appropriate packages.
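
The reprepro end of such a frontend largely amounts to rendering a
conf/distributions stanza plus a conf/updates rule pointing at a FilterList.
A minimal sketch follows; the codename, component, URL, and file names are
placeholders, and this is an illustration of the idea rather than a tested
configuration:

```python
def reprepro_conf(codename, architectures, mirror_url, filterlist):
    """Render minimal conf/distributions and conf/updates stanzas for a
    filtered (partial) reprepro mirror.  All values are placeholders."""
    distributions = (
        'Codename: %s\n'
        'Components: main\n'
        'Architectures: %s source\n'
        'Update: upstream\n' % (codename, ' '.join(architectures))
    )
    updates = (
        'Name: upstream\n'
        'Method: %s\n'
        'Suite: %s\n'
        'Components: main\n'
        # Packages absent from the list fall back to 'deinstall',
        # i.e. they are not mirrored.
        'FilterList: deinstall %s\n' % (mirror_url, codename, filterlist)
    )
    return distributions, updates
```

The FilterList file itself would be the germinate-expanded package list, one
"package install" line per entry.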

I would be happy to help with this, as I could use such an application, and I 
already have a meager bit of python code that parses the output of germinate 
(germinate uses a wiki-type markup in its output files).  I stopped working 
on the code when I bought a new hard drive, since the extra space solved the 
problem for me, but I can bring it back to life, as I would prefer a more 
correct solution.

-- 
Thanks:
Joseph Rawson




[Ann] Paella

2009-05-28 Thread Joseph Rawson
Hello!

Long ago, I made an announcement about paella, a project that I had just 
started to develop:

http://lists.debian.org/debian-devel/2003/09/msg01585.html

During much of this time, running paella required a small network of machines 
holding the infrastructure necessary to do automated installs.  Times have 
changed since then.  Almost one year ago, I purchased a laptop for about $670.  
The laptop came with a dual-core cpu, 3G of ram, and a large hard drive.  
Once I got the laptop, I decided it was time to start using virtual machines 
to provide an infrastructure for working with paella.  I started to create a 
new quickstart guide that helps bootstrap a minimal infrastructure that 
paella could work within.  I feel that since the newer computers have 
sufficient resources to power a small virtual network, it would be easier for 
people to get started with paella.

Paella has been moved, since the initial announcement.  It is now located 
here:

http://paella.berlios.de/

I'm not experienced in advertising or marketing, so the website is a bit dull.  
It should contain enough documentation to give a good idea of what paella is, 
how it's designed, and how to use it.  The quickstart guide is located here:

http://paella.berlios.de/docs/quickstart-vbox.html

I have tried to write the quickstart document to guide the reader as quickly 
as possible to ending up with a minimum infrastructure required to use 
paella.  At the end of the guide, once the preparations are made, a quick 
tutorial on using paella helps the reader install another virtual machine 
very similar to the one built during the first part of the install.  It 
should take about 2-3 hours to complete the quickstart guide.  Short of 
uploading a pre-made disk image somewhere, I can't figure out a more 
convenient method of starting out with paella.

In the beginning, when I started paella, there were very few tools that 
existed to help with making local debian repositories, or live nfsroot 
systems.  As a result, I spent a good deal of time implementing my own 
solutions to these problems.  As time went by, and better tools became 
available, I started to prune the code in favor of some of those tools.  With 
the help of tools, like reprepro and live-helper, I can now concern myself 
with just the installation and data management aspects of paella, which 
really helps.

At this time, the most important parts of paella are complete.  The database 
schema is fairly stable, and not likely to change in any way that's 
important.  The structure of the xml files is also pretty solid, and also not 
likely to change in any appreciable way.  Most of the management gui is 
complete, at least enough to not have to use another database manager (or 
straight SQL) for most common operations.  The installer objects have also 
been redesigned and tested quite a bit.  The operation of the installers is 
not likely to change, but some steps may be added or removed, although most 
likely not in a way that would break most configurations.

In the past, it may have seemed that paella was dead.  This is mostly because 
I'm not very vocal when it comes to advocating it, and I've not put very much 
documentation online.  Since 2004, I have used paella to install many servers 
and a few desktops for small businesses.  So, paella has been used in a 
working environment, and it has helped me earn some money.  It's no magic 
bullet, as it can take quite a while to create and test a configuration; 
however, once you have things set up, the time it took to create the 
configuration can really pay off.

This is a good time for people who might be interested in paella to take a 
good look at what's been accomplished, and for those who would like to use 
it, to help direct the rest of the work that will need to be done before a 
stable 1.0.0 version is released.  I have a page where the future direction 
of paella is described:

http://paella.berlios.de/docs/plans.html

I want to thank all the debian developers for creating a very good system that 
makes it easier for me to do the work that I've been doing.  In the last few 
years, the focus on using debconf, and getting packages to install without 
manual intervention has allowed me to remove quite a number of hacks that I 
once needed to coerce some packages into installing without intervention.  
This has helped me concentrate more on getting packages configured, rather 
than have the time divided between getting the package installed correctly, 
and then configuring the package.  I haven't had to use an expect script in a 
long time now!

-- 
Thanks:
Joseph Rawson




paella

2003-09-30 Thread Joseph Rawson
abstract

Paella is a system for creating/installing customized Debian systems.
There are no current plans for configuration updates on live systems;
that's not its initial purpose, although it may be extended later.

Paella is still in a planning state, and the configuration ability is
really simple right now.

Inspiration for paella comes from many places:

debconf -- an important abstract configuration system
        which is sadly not yet supported by paella (can somebody help me?).
        Support for debconf is fully planned; I just haven't figured out
        the best way to template a debconf template, and whether to create
        a flat db, use a dir/db over NFS, or LDAP (which sounds like the
        best choice, though I'm unfamiliar with it).
        Paella is not a replacement for debconf.

fai -- fully automatic install
        an excellent installer, with the capability of bootstrapping
        an entire network!  Paella was going to just complement fai,
        but I'm starting to prefer the configuration layout I'm
        planning, so paella will probably completely replace fai.

Knoppix -- and Morphix, Gnoppix, and other live CDs
        paella will eventually provide a jigdo-like description of the
        CD, so it will be easier to construct the CDs from a local
        repository, and to modify them before creation.

OpenZaurus -- and Matt Zimmerman's efforts with debian-handhelds
        building oz is a pita; I'm not used to bitkeeper, and I can't
        find a specific build-root for a specific release.  It's like
        the source is available, but not specifically per package.
        Overall, though, I have found the OpenZaurus cross-compiling
        solution to be the most preferable way to build source for an
        embedded device.  My solution now is to use paella and pbuilder
        (thanks for the work in getting so many packages to
        autobuild! ;)) to define, bootstrap, and control cross-compiling
        pbuilders with autobuilt toolchains.  This should make it easier
        to script autobuilds of embedded systems, patches and all :)

Demudi, Debian-Lex, ...
        I think paella can be very instrumental in helping configure and
        install a custom network on a per-network-type basis (i.e. what
        machines are on the network, and what are their jobs, expected
        activities, etc.).  I am also thinking of networks with
        custom-configured roaming PDAs, laptops, or whatever else can
        take a Debian system.

and that is also probably the order I will be working in.

I am aware that customized packages can be made and distributed (like
tasksel and jablicator), but it is not paella's designed purpose to be
used in that manner.  Once paella is done configuring and installing, it
should leave no trace of itself.  The user should be left with a clean,
customized Debian system.  This is mainly a tool meant for somebody who
is going to be creating, configuring, and installing many Debian systems.

I have tried to address a common denominator that most of these projects
share, but seem to be missing.

My urgent need to get this system working, so I can start earning a living
with it, has caused the code to wind up a bit sloppy.  Sorry.

All of the code is in CVS; the schema has been changing too much for any
kind of release.  I would also like to make a request for comments
before making a release, as I don't want to do something really stupid
:).  I am still a newbie with the release and versioning stuff.

Anyway, the project is at http://sourceforge.net/projects/paella/

BTW, paella is like a little bit of everything, with Debian as the rice.