[gentoo-portage-dev] Constraint-Based Dependency Solver: initial results

2019-08-30 Thread michael . lienhardt
Dear all,

It's possible that the goal of my current work and its possible benefit for the 
gentoo community are not very clear.
In my experience, emerge is a very good tool to install new packages whose use 
flags have already been configured.
However, when the packages are not correctly configured, emerge can guess, 
without guarantee, some use flags to select/unselect; and unmerging a package 
seems to be the user's entire responsability.
My work has several goals:
 - provide a correct and complete implementation of a dependency solver for 
portage that can help debug emerge
 - since the solver is correct and complete, it would provide a "safe" 
implementation of unmerge, where missing dependencies would be satisfied by the 
installation of new packages
 - provide a rather small code base that is easy to debug and on which it is 
easy to prototype some tools, like REQUIRED_USE checks, or interactive use flag 
configuration
 - be usable, both in term of memory usage and computation time.

I finished implementing the solver and did some preliminary benchs (1000 tests 
where I checked if a random set of packages -- different for each test -- could 
be installed), including comparison with emerge in "search" modality, i.e., 
with the options --autounmask y --autounmask-continue y --autounmask-backtrack y
Since this is preliminary, I wanted to talk to you about it but I don't have 
every bugs identified yet.
In average, my solver is a bit less than 10 times slower than emerge (not very 
good, but not bad as it is still far better than a brute force approach, which 
is more than 100 times slower and takes 3Gb of memory).
It is important to note that my solver is not suited for end user usage yet 
(the answer it gives is correct, but random -- it includes useless packages, 
useless package removal and old versions), but it is the first tool that I know 
of that can correctly and completely (modulo bugs ^^) check emerge's behavior.

A first result: in more than 25% of the tests that can be installed, emerge 
fails.
Most of these failures come from the fact that even in "search" modality, 
emerge stops when it sees a REQUIRED_USE that is not correctly configured (I'll 
post a bug report about it and about the SLOT heuristics in a few days).
I still need to look at the other failures to see what caused them.

Additionally, it seems that when I do "emerge package1 package2", emerge first 
solves the dependencies of package1, and then of package2.
It seems that when resolving the depenendencies of package2, emerge can end up 
with a conflict between the choices it made for solving the dependencies of 
package1 and the requirements of package2.
I say "seems" because my tests were automatically done with a rather long list 
of packages (so other reasons for the observed failures are possible).
I'll try to pin down the actual emerge behavior and possibly file a bug report.

I am currently porting the tests on docker (to have a safe testing environment) 
and once that's done, I will be able to give actual bug reports.
In the future, I have the following plan:
 - cleanup the output of the solver, so it wouldn't be random and be usable 
instead (this is "just" a technical and boring algorithm to implement, but time 
consuming)
 - install the configuration computed by my solver (I am still unsure that 
emerge --nodeps would be flexible enough, and maybe I will have to implement my 
own planner)
 - find a correct abstraction of packages, similar to the SLOT heuristics ( 
https://gitweb.gentoo.org/proj/portage.git/commit/?id=a9064d08ef4c92a5d0d1bfb3dc8a01b7850812b0
 ), to improve efficiency
 - design an interactive package configuration algorithm + UI that would happen 
during the dependency solving process and really help the user configuring what 
he wants and let the solver do the rest
 - start reading portage's bug tracker and continue reading its code
 - extend pdepa with other kind of relevant analysis

All comments/suggestions are welcomed.

Best,
Michael



Re: [gentoo-portage-dev] Constraint-Based Dependency Solver: initial results

2019-10-08 Thread Michael Orlitzky
On 8/30/19 10:34 AM, michael.lienha...@laposte.net wrote:
> 
> All comments/suggestions are welcomed.
> 

Since no one else has said it yet (?), I think this approach is really
cool and I'm glad someone is working on it.

Modeling difficult computations as abstract optimization problems is a
"hobby" of mine, and I think it will pay off if we can abstract away a
big ugly part of the PM and try to swap it out for something more
principled.



Re: [gentoo-portage-dev] Constraint-Based Dependency Solver: initial results

2019-10-08 Thread Zac Medico
On 8/30/19 7:34 AM, michael.lienha...@laposte.net wrote:
>  - install the configuration computed by my solver (I am still unsure that 
> emerge --nodeps would be flexible enough, and maybe I will have to implement 
> my own planner)

One problem with emerge --nodeps is that it does not resolve file
collisions in cases where two packages block each other. Normally, such
collisions are resolved by removing the files from the contents of the
installed package (the installed package is ultimately unmerged in order
to solve the blocker), but with --nodeps the blockers are simply
discarded and emerge treats all file collisions as fatal.

Another problem is that --nodeps eliminates the dependency information
that's needed for scheduling parallel builds with emerge --jobs.
-- 
Thanks,
Zac



signature.asc
Description: OpenPGP digital signature