Re: portupgrade O(n^m)?

Sean Bryant Wed, 28 Feb 2007 23:14:29 -0800

[EMAIL PROTECTED] wrote:

On Thu, 15 Feb 2007, Michel Talon wrote:
Give me a few weeks, and if I can band together with a few people I
wanted to try and port sections of portupgrade and its related tools to
C++ (and maybe do some code tweaks along the way). Most of the ruby
files are over 400 lines long, sparsely commented, and I don't know ruby
enough to port right now, but I've been making some headway lately so
I'll try porting some stuff soon.
I think that porting portupgrade to C++ would be time spent in vain. In
my opinion, some of the basic ideas of portupgrade are deeply flawed,
and as much as one polishes the algorithms it will not gain much. The
idea of keeping state in databases is deeply flawed, it is constantly
broken, and doesn't help in speed at all. This was one of the
motivations of portmaster, get rid of database dependencies. In my
opinion, upgrading progressively, that is, port by port, is deeply
flawed. There is a 90% chance that something will go wrong in the middle
and you will be stuck with a half-upgraded system.

So in my opinion, what is needed is to think about the problem in a
radically new way, write a prototype in a scripting language to
experiment with the solutions, and then code it in C++. Personally I
have done that: I have written a Python script, which can be found here:
http://www.lpthe.jussieu.fr/~talon/pkgupgrade
(it needs the companion
http://www.lpthe.jussieu.fr/~talon/save_pkg.py).
For the time being I still have bugs that I am working on, but at
least these bugs show that the problem is vastly more complicated than
one can imagine at first.

Why Python? Because it is much more readable than Perl or Ruby, and much
faster than Ruby. In my opinion Ruby is vastly overhyped; it is much
closer to rubbish than anything else.
What ideas? Don't use any database or database connector; do everything
in memory and recompute the needed information on the fly. It works very
well: one can count on something of the order of 1 to 2 minutes to
perform the necessary analysis for 700 ports. Second, download as many
precompiled packages as possible, at full speed, that is, over the same
connection to the FTP server. This works very well: if you have a good
internet connection, in 15 to 20 minutes you have your packages.

Why packages?
Because packages don't break when compiling. Compiling from source is
asking for problems. If you minimise the number of compilations you
minimise the risk of breakage. Moreover, while downloading one can back
up the old packages at the same time, and so gain time. By contrast, for
every package, portupgrade first does a dependency analysis that could be
done once, then does a backup, then fetches the binary package or
compiles, then installs it, then discards the backup. All this is a
terrible loss of time.
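
The point about doing the dependency analysis once, in memory, can be
sketched roughly as follows. This is only an illustration and not the
code of pkgupgrade: the `deps` mapping and the port origins in it are
hypothetical, and Python's standard `graphlib` module (Python 3.9 or
later) is swapped in for whatever ordering logic the real script uses.

```python
from graphlib import TopologicalSorter


def upgrade_order(deps):
    """Return the ports ordered so that every port comes after the
    ports it depends on. The whole graph is analysed in one pass,
    instead of once per port."""
    return list(TopologicalSorter(deps).static_order())


# Hypothetical data: port origin -> set of origins it depends on.
deps = {
    "x11/xterm": {"x11/libX11"},
    "x11/libX11": {"devel/pkg-config"},
    "devel/pkg-config": set(),
}

order = upgrade_order(deps)
for origin in order:
    print(origin)
```

The resulting list can then be reused for the backup, fetch and install
phases without recomputing anything per package.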

Finally, my script produces a shell script able to do the upgrade. So you
can see in written form *exactly* what will be removed, what will be
installed from binary packages, and what will be compiled. All necessary
packages for installation are already present on the machine. There is
absolutely no element of surprise; you can evaluate the risk soundly.
These are the ideas I have explored.
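
As a rough sketch of the "generate a reviewable shell script" idea (not
the actual output of pkgupgrade, whose format is not shown here): the
function name, its arguments, the package names and the paths below are
all hypothetical. Only the plain `pkg_delete <name>`, `pkg_add <file>`
and `make install clean` invocations are assumed from the tools
themselves.

```python
import shlex


def render_upgrade_script(to_delete, to_add, to_compile,
                          ports_dir="/usr/ports"):
    """Build the text of a shell script listing every action up front.

    The caller is assumed to pass the three lists already sorted in a
    safe order; nothing is executed here.
    """
    lines = ["#!/bin/sh", "set -e", ""]
    lines.append("# 1. Remove the old packages")
    lines += ["pkg_delete " + shlex.quote(name) for name in to_delete]
    lines += ["", "# 2. Install the binary packages already downloaded"]
    lines += ["pkg_add " + shlex.quote(path) for path in to_add]
    lines += ["", "# 3. Compile what has no binary package"]
    lines += [
        "(cd " + shlex.quote(ports_dir + "/" + origin)
        + " && make install clean)"
        for origin in to_compile
    ]
    return "\n".join(lines) + "\n"


# Hypothetical example input.
script = render_upgrade_script(
    to_delete=["pan-0.14.2"],
    to_add=["/var/tmp/pkgs/pan-0.125.tbz"],
    to_compile=["editors/vim"],
)
print(script)
```

Because the script is plain text, it can be read, edited or diffed before
anything on the machine is touched.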

Now, performance-wise, when you run the shell script it takes around 2
hours. This is entirely time spent by pkg_delete (roughly 15 minutes) and
pkg_add (roughly 1 hour 45 minutes) for around 500 ports replaced. This
is very long, sure, but it can be optimized only by working on pkg_delete
and pkg_add. No amount of work on portupgrade or a replacement will help
in any way.

As for the remaining bugs I have, they are entirely due to the crappy
complexity that FreeBSD port developers introduce by constantly
modifying the origins of the ports. So for a given program I can have 3
different origins: one from when the port was previously installed on
the machine, another from when the last RELEASE was produced, and a last
one if I compile the port now on the machine with the present state of
the ports tree. These 3 origins may be different; I have examples.
These morons are *constantly* modifying the names, as an exercise in
bikeshed painting. For example pan -> pan2 -> pan, etc. Cycles don't
worry them at all!
Of course, for a given piece of software, you may have all combinations,
such as nonexistent or existent at the time the machine was installed,
at the time of the release, or at present.
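
To make the three-origin problem concrete, here is a small sketch. It is
not taken from pkgupgrade; the function name and the sample origins are
hypothetical, and `None` stands for "did not exist in that state".

```python
def origin_conflicts(installed, release, current):
    """Compare the origin of each program across the three states and
    return those for which the states disagree.

    Each argument maps a program name to its port origin in one state:
    as installed on the machine, as of the last RELEASE, and in the
    present ports tree.
    """
    conflicts = {}
    for name in sorted(set(installed) | set(release) | set(current)):
        triple = (installed.get(name), release.get(name), current.get(name))
        if len(set(triple)) > 1:
            conflicts[name] = triple
    return conflicts


# Hypothetical data modelled on the pan -> pan2 -> pan cycle.
installed = {"pan": "news/pan", "vim": "editors/vim"}
release = {"pan": "news/pan2", "vim": "editors/vim"}
current = {"pan": "news/pan", "vim": "editors/vim", "newtool": "misc/newtool"}

conflicts = origin_conflicts(installed, release, current)
for name, triple in conflicts.items():
    print(name, triple)
```

Note that in the cycle case the installed and current origins agree, so a
tool that compares only two of the three states would miss the rename.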

Compare that to the situation for Debian apt-get. The names are
preserved. They have strict rules about package naming, they stick to
them and don't change them arbitrarily. All packages exist in compiled
form, you don't have to worry about prepackaged or "to be compiled, so
has 50% chance to break". You have only 2 states to consider instead of
3: the state on the machine and the state on the repository. Things are
vastly simpler. No wonder that apt-get works and portupgrade doesn't.
This has nothing to do with the fact that apt-get is written in C++.
(sorry to cross post, but this thread is just as relevant to @ports as
it is to @hackers)
Well, since you brought up Debian's apt-get system I thought it'd be a
good idea to take a look at the Gentoo Linux emerge / portage system
(patterned after FreeBSD):
=====
Pros:
=====
-It's written in python (portable).
-It's a system which focuses on ports compilation from source, not
 binary package installation.
-Stores information in a db format (not Berkeley DB, but something
 different) for entire system in a common file; stores installed leaf
 package information in another simple textfile.
-Has flags for stability reasons, since some packages are alpha or beta
 and don't compile under certain architectures.
-Portage files are fetched via rsync.
-Has separate portage files which are phased out over time, in case the
 portage maintainers move the files in one release. The maintainers then
 create an informative message which describes what's going on while
 emerging the package or going through the portage database. If possible
 the outdated package is pruned and the newer, more recent dependency is
 merged.
=====
Cons:
=====
-It's written in python (not fast).
-Uses rsync.

======
Point:
======
Apart from what's listed in the above paragraph, Gentoo's portage may
have several things that are better than FreeBSD's port system:
-Limited life cycle for versioning, which doesn't force server / desktop
 owners to fix a number of machines all at once, but instead gives them a
 heads up before a big change occurs and automatically unmerges old
 dependencies and emerges new items, if possible.
-One common interface for package / portage management--not 10 little
 tools which do basically the same thing, or are specialized for specific
 tasks.
-One common file for all installed packages / ports, not a series of
 directories and files.
-Separate versioning for files, which doesn't break things nearly as
 much as one common ports Makefile for each file.
-A means to search for portage items and their descriptions, without
 having to deal with a tool that doesn't really work reliably.
It's not so much that I'm trying to bash on FreeBSD, but there's
definitely a revision that needs to be made to the way that ports /
packages are done, because it seems that the committee in charge of ports
planning and the overall roadmap has let things get a bit off track,
just because of the sheer number of ports items available. Something can
be fixed and should be. I can only do a portion of the load myself in so
much time, since I'm going to work and school right now.
=======
In light of previous statement:
=======
I wasn't trying to port the pkg_* and port* utils to C++ thinking that I
would magically get more optimized code. Sure, C++ is much better than
ruby at optimizations if done correctly, but C++ is also easier to screw
up than ruby or perl or python, because you have the power to shoot
yourself in the foot more easily (not as much as C or ASM, but close).

The point was that with C++ we could finally get a set of standardized
tools and a common interface for FreeBSD for managing ports / packages
which could be included in the base system, not a bunch of little
specialized tools and packages.

I'll have to approach this problem from a black box perspective and be
careful in planning this out, but my goal is to be as backwards
compatible as possible, or at least provide migration tools to ease the
move from the old system to the new one.

Again, if anyone is interested in helping me out, it would be more than
welcome. That way we could ensure that the project gets done in a timely
manner and can reduce bugs and think of better solutions (more people
can help in thinking out of the box, the larger the group).

Thanks,
-Garrett

PS Please reply on the @hackers list, if possible.

_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"

Honestly, I'd be more interested in a package building system. Maybe be
a little more liberal in the default building of ports. It doesn't need
to build a package of every port, just the common ones. That way it's
easier to get up and running. Things like xorg, GNOME and KDE take ages
to build, and it would be awesome if there were a decent package fetching
system: something like apt-get, where you could add some kind of
repository, pull down a list of packages and choose what you want. This
can be emulated in a way using portupgrade -P and changing pkgtools.conf
to have some more mirrors to fetch from. A pointyhat macro is there, but
it probably shouldn't be abused: pointyhat is there to look for problems,
not to build packages for us consumers. That is just a side effect, or at
least this is how it was explained to me. A neat thing might be a
distributed package building project, where packages are picked apart and
pieces are built all over the place. Get enough places to donate CPU and
package building might be a thing of the past. But those are just pipe
dreams right now.

The slowness affects me after a mass upgrade; after that I'm fine. Maybe
someone can look into profiling portupgrade and seeing whether the time
goes into portupgrade itself or the pkg_* tools.



