[EMAIL PROTECTED] wrote:
On Thu, 15 Feb 2007, Michel Talon wrote:

Give me a few weeks; if I can band together with a few people, I want
to try to port sections of portupgrade and its related tools to C++
(and maybe do some code tweaks along the way). Most of the Ruby files
are over 400 lines long and sparsely commented, and I don't know Ruby
well enough to port them right now, but I've been making some headway
lately, so I'll try porting some of it soon.

I think that porting portupgrade to C++ would be time spent in vain. In
my opinion, some of the basic ideas of portupgrade are deeply flawed,
and however much one polishes the algorithms, little will be gained. The
idea of keeping state in databases is deeply flawed: the databases are
constantly broken, and they don't help speed at all. This was one of
the motivations for portmaster: get rid of database dependencies. In my
opinion, upgrading progressively, that is, port by port, is also deeply
flawed. There is a 90% chance that something will go wrong in the
middle, and you will be stuck with a half-upgraded system.

So in my opinion, what is needed is to think about the problem in a
radically new way, write a prototype in a scripting language to
experiment with solutions, and then code it in C++. Personally, I have
done that: I have written a Python script, which can be found here:
http://www.lpthe.jussieu.fr/~talon/pkgupgrade
(it needs the companion
http://www.lpthe.jussieu.fr/~talon/save_pkg.py).
For the time being I still have bugs that I am working on, but at
least these bugs show that the problem is vastly more complicated than
one might imagine at first.

Why Python? Because it is much more readable than Perl or Ruby, and much
faster than Ruby. In my opinion Ruby is vastly overhyped; it
is much closer to rubbish than anything else.
What ideas? First, don't use any database or database connector; do
everything in memory and recompute the needed information on the fly.
It works very well: one can count on something of the order of 1 to 2
minutes to perform the necessary analysis for 700 ports. Second,
download as many precompiled packages as possible, at full speed, that
is, reusing the same connection to the FTP server. This works very
well: with a good internet connection, you have your packages in 15 to
20 minutes.
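
The in-memory analysis described above can be sketched as follows. This
is only a minimal illustration, not the code of the pkgupgrade script:
the package names and the `depends_on` mapping are invented, and the
function name `upgrade_order` is hypothetical. It assumes the dependency
information has already been read into a plain dictionary, and uses the
standard-library `graphlib` module (Python 3.9 or later) to compute, in
one pass and without any database, an order in which every dependency
comes before the packages that need it.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def upgrade_order(depends_on):
    """Return package names so that dependencies precede their dependents.

    depends_on maps a package name to the list of packages it depends on.
    Everything is held in memory; nothing is cached on disk.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical data, for illustration only.
installed = {
    "gnome-terminal": ["gtk", "vte"],
    "vte": ["gtk"],
    "gtk": ["glib"],
    "glib": [],
}

order = upgrade_order(installed)
print(order)
```

Because the whole graph is analysed once, the result can be reused for
every package instead of being recomputed per port.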

Why packages?
Because packages don't break when compiling. Compiling from source is
asking for problems. If you minimise the number of compilations, you
minimise the risk of breakage. Moreover, old packages can be backed up
while the new ones download, which saves time. By contrast, for every
package, portupgrade first does a dependency analysis that could be
done once, then makes a backup, then fetches the binary package or
compiles, then installs it, then discards the backup. All this is a
terrible loss of time.

Finally, my script produces a shell script that performs the upgrade.
So you can see in written form *exactly* what will be removed, what
will be installed from binary packages, and what will be compiled. All
packages needed for installation are already present on the machine.
There is absolutely no element of surprise; you can evaluate the risk
soundly. These are the ideas I have explored.
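
To make the "plan first, then run" idea concrete, here is a rough
sketch of how such a shell script could be generated. It is not taken
from pkgupgrade: the function name `render_upgrade_script`, the package
names, the paths, and the exact command lines are assumptions chosen
for illustration (the sketch assumes the classic pkg_delete and pkg_add
tools and a ports tree under /usr/ports).

```python
import shlex


def render_upgrade_script(remove, install, build):
    """Render a reviewable /bin/sh script from three pre-computed lists.

    remove  -- installed package names to delete
    install -- paths of binary packages that are already downloaded
    build   -- port origins (category/name) to compile from source
    """
    lines = ["#!/bin/sh", "set -e  # stop at the first failing step", ""]
    lines.append("# Packages to remove")
    lines += [f"pkg_delete -f {shlex.quote(name)}" for name in remove]
    lines += ["", "# Binary packages to install (already on disk)"]
    lines += [f"pkg_add {shlex.quote(path)}" for path in install]
    lines += ["", "# Ports to compile from source"]
    lines += [
        f"(cd {shlex.quote('/usr/ports/' + origin)} && make install clean)"
        for origin in build
    ]
    return "\n".join(lines) + "\n"


# Hypothetical plan, for illustration only.
script = render_upgrade_script(
    remove=["pan-0.14.2"],
    install=["/var/tmp/packages/pan-0.132.tbz"],
    build=["editors/vim"],
)
print(script)
```

Nothing is executed while the script is being produced, so the whole
plan can be read and edited before any package is touched.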

Now, performance-wise, running the shell script takes around 2 hours.
This is entirely time spent in pkg_delete (roughly 15 minutes) and
pkg_add (roughly 1 hour 45 minutes) for around 500 ports replaced. This
is very long, sure, but it can be optimized only by working on
pkg_delete and pkg_add. No amount of work on portupgrade or a
replacement will help in any way.

As for the remaining bugs I have, they are entirely due to the crappy
complexity that FreeBSD port developers introduce by constantly
modifying the origins of the ports. So for a given program, I can have 3
different origins: one from when the port was previously installed on
the machine, another from when the last RELEASE was produced, and a
third if I compile the port now on the machine with the present state
of the ports tree. These 3 origins may all be different; I have examples.
These morons are *constantly* modifying the names, as an exercise in
bikeshed painting. For example pan -> pan2 -> pan, etc. Cycles don't
worry them at all!
Of course, for a given piece of software, you may have all combinations
of existing or not existing at the time the machine was installed, at
the time of the release, and at present.
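
One possible way to cope with renames that cycle, such as
pan -> pan2 -> pan, is to treat each rename as a dated event and to
replay only the events that happened after the origin was recorded.
The sketch below is an illustration under that assumption, not the
method used in pkgupgrade: the function name `current_origin` is
hypothetical and the dates are invented. The tuple layout loosely
mirrors the old|new|date|reason records of the ports tree's MOVED file.

```python
def current_origin(origin, recorded_on, moves):
    """Follow renames dated after `recorded_on`; return the present origin.

    moves is an iterable of (old_origin, new_origin, date) tuples whose
    dates are ISO strings (YYYY-MM-DD), so string order is date order.
    An empty new_origin means the port was removed; None is returned.
    """
    # A single chronological pass terminates even when names cycle.
    for old, new, date in sorted(moves, key=lambda move: move[2]):
        if date > recorded_on and old == origin:
            if not new:
                return None
            origin = new
    return origin


# Invented dates, for illustration only.
moves = [
    ("news/pan", "news/pan2", "2005-03-01"),
    ("news/pan2", "news/pan", "2006-09-01"),
]

print(current_origin("news/pan", "2004-12-01", moves))
print(current_origin("news/pan2", "2005-06-01", moves))
```

The same function can be called once per state (origin recorded at
install time, origin at the last RELEASE) to bring all of them to the
naming used by the present ports tree before comparing them.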

Compare that to the situation with Debian's apt-get. The names are
preserved. They have strict rules about package naming; they stick to
them and don't change names arbitrarily. All packages exist in compiled
form, so you don't have to worry about prepackaged versus "to be
compiled, so has a 50% chance to break". You have only 2 states to
consider instead of 3: the state on the machine and the state in the
repository. Things are vastly simpler. No wonder that apt-get works and
portupgrade doesn't. This has nothing to do with the fact that apt-get
is written in C++.

(sorry to cross post, but this thread is just as relevant to @ports as it is to @hackers)

Well, since you brought up Debian's apt-get system, I thought it'd be a good idea to take a look at the Gentoo Linux emerge / Portage system (patterned after FreeBSD's ports):

=====
Pros:
=====
-It's written in python (portable).
-It's a system which focuses on ports compilation from source, not binary package installation.
-Stores information in a db format (not Berkeley DB, but something different) for entire system in a common file; stores installed leaf package information in another simple textfile.
-Has flags for stability reasons, since some packages are alpha or beta and don't compile under certain architectures.
-Portage files are fetched via rsync.
-Has separate portage files which are phased out over time, in case the portage maintainers move the files in one release. The maintainers then create an informative message which describes what's going on while emerging the package or going through the portage database. If possible the outdated package is pruned and the newer, more recent dependency is merged.

=====
Cons:
=====
-It's written in python (not fast).
-Uses rsync.

======
Point:
======
Apart from what's listed in the above paragraph, Gentoo's portage may have several things that are better than FreeBSD's port system:

-Limited life cycle for versioning, which doesn't force server / desktop owners to fix a number of machines all at once, but instead gives them a heads up before a big change occurs and automatically unmerges old dependencies and emerges new items, if possible.
-One common interface for package / portage management--not 10 little tools which do basically the same thing, or are specialized for specific tasks.
-One common file for all installed packages / ports, not a series of directories and files.
-Separate versioning for files, which doesn't break things nearly as much as one common ports Makefile for each file.
-A means to search for portage items and their descriptions, without having to deal with a tool that doesn't really work reliably.

It's not that I'm trying to bash FreeBSD, but the way ports / packages are done definitely needs a revision, because it seems that the committee in charge of ports planning and the overall roadmap has let things get a bit off track, just because of the sheer number of ports available. Something can be fixed and should be. I can only carry a portion of the load myself, since I'm working and going to school right now.

=======
In light of previous statement:
=======

I wasn't trying to port the pkg_* and port* utils to C++ thinking that I would magically get more optimized code. Sure, C++ is much better than Ruby at optimizations if done correctly, but C++ is also easier to screw up than Ruby, Perl or Python, because it gives you more power to shoot yourself in the foot (not as much as C or ASM, but close).

The point was that with C++ we could finally get a set of standardized tools and a common interface for FreeBSD for managing ports / packages which could be included in the base system, not a bunch of little specialized tools and packages.

I'll have to approach this problem from a black box perspective and be careful in planning this out, but my goal is to be as backwards compatible as possible, or at least to provide migration tools to ease the move from the old system to the new one.

Again, if anyone is interested in helping me out, it would be more than welcome. That way we could ensure that the project gets done in a timely manner, reduce bugs, and think of better solutions (the larger the group, the more people can help with thinking outside the box).

Thanks,
-Garrett

PS Please reply on the @hackers list, if possible.

_______________________________________________
freebsd-hackers@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-hackers
To unsubscribe, send any mail to "[EMAIL PROTECTED]"
Honestly, I'd be more interested in a package building system. Maybe be a little more liberal in the default building of ports. It doesn't need to build a package of every port, just the common ones. That way it's easier to get up and running. Things like xorg, GNOME and KDE take ages to build, and it would be awesome if there were a decent package fetching system: something like apt-get, where you could add a repository, pull down a list of packages, and choose what you want. This can be emulated in a way using portupgrade -P and changing pkgtools.conf to add some more mirrors to fetch from. A pointyhat macro is there, but it probably shouldn't be abused, as pointyhat is there to look for problems, not to build packages for us consumers; that is just a side effect, or at least this is how it was explained to me.

A neat thing might be a distributed package building project, where packages are picked apart and pieces are built all over the place. Get enough places to donate CPU and package building might be a thing of the past, but those are just pipe dreams right now.

The slowness affects me after a mass upgrade; after that I'm fine. Maybe someone can look into profiling portupgrade to see whether the slowness lies in portupgrade itself or in the pkg_* tools.


