Re: SAT based britney - formalisation of the problem
Hi, Am Sonntag, den 17.07.2011, 12:08 +0100 schrieb Adam D. Barratt: > On Sun, 2011-07-17 at 12:41 +0200, Joachim Breitner wrote: > > And can one specify also > > individual binary packages (e.g. after a binNMU) for transition? > > http://release.debian.org/britney/hints/README mention only > > source-package/version tuples. > > One can't specify binary packages. The syntax > "source-package/arch/source-version" is used for migrating binNMUs and > roughly translates as "the set of binaries built from (source) version > of on ". ok, thanks. That should work as well. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part
Re: SAT based britney - formalisation of the problem
On Sun, 2011-07-17 at 12:41 +0200, Joachim Breitner wrote: > And can one specify also > individual binary packages (e.g. after a binNMU) for transition? > http://release.debian.org/britney/hints/README mention only > source-package/version tuples. One can't specify binary packages. The syntax "source-package/arch/source-version" is used for migrating binNMUs and roughly translates as "the set of binaries built from (source) version of on ". Regards, Adam -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/1310900896.3606.26.ca...@hathi.jungle.funky-badger.org
Re: SAT based britney - formalisation of the problem
Hi, Am Freitag, den 01.07.2011, 23:57 +0200 schrieb Luk Claes: > I won't stop anyone from experimenting with a SAT based solution, though > from a release point of view, I think it would be better to start using > britney2 and kill its bugs (which unfortunately will take some time > AFAICS) before diving into a SAT based adventure. maybe a first useful direction for SAT-deployment would be to have it generate hints for the transition it suggests. Unless I am mistaken, hints are checked by britney for correctness, so if the Conflict handling is insufficient, the hint becomes a no-op. But in the cases where no problem arises, the hint might already be a useful time-saver for the release team. Is “hint” or “easy” appropriate for this? And can one specify also individual binary packages (e.g. after a binNMU) for transition? http://release.debian.org/britney/hints/README mention only source-package/version tuples. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part
Re: SAT based britney - formalisation of the problem
Hi, there seems to be no urgend need for this system, hence we can take the time to collect more ideas and vary the design a bit: Am Mittwoch, den 29.06.2011, 17:55 +0200 schrieb Ralf Treinen: > The important question here would be to define what are precisely the > "interesting" installation requests. Candidates are: > - install one arbitrary single package A. This seems to be the definition > currently used in britney. > - install an arbitray *set* of packages together. This might be interesting > in some cases, but seems to be too restrictive. this is in interesting idea. Looking at each installable subset in testing and enforcing its installability after the migration sounds far too complex, but it leads to this (not fully thought through) ansatz¹ to be able to fully support Conflicts. The idea is to look at a partition of testing into installable subsets (or a covering by installable subsets). Assume we find such a partition, and their number is low (for some value of low). Then we can generate a PMAX-SAT instance that ensures that after transition, the same subsets are still co-installable. This could be explained less formally this way: We assume we have a (small) number of exemplary machines that run Testing, such that every package is installed on at least one of the machines. After the migration, each of these machines can fully upgrade to the new testing. The effect would be, for example, that for each MTA we have a different subset. All of Haskell, on the other side, can reside in one such “machine”. This way, a migration as found by the tool would always ensure installability in testing (as installability of the “machines” ensures installability of each individual package). This is an improvement over the previous idea of ignoring some Conflicts. The downside is that a transition might be prevented that would be allowed previously, most notable if a Conflict in introduced by new versions of A and B which are in in the same set. In that case, this fact needs to be made explicit by the tool and then the release team can look at the conflict and say: „This conflict may be in testing.“ Then the system would choose a different partition with A and B in different sets to start with and the Conflict would not prevent the migration any more. Once A and B are in testing, they ensure that a proper partition is chosen and the manual interaction can be forgotten. The handling of new packages needs to be addressed as well. In effect, the SAT problem would solve a distcheck-problem per each “machine” × each arch, in addition to the migration-related constraints connecting these. Whether this is still feasible performancewise is not clear. Also it is not clear whether it is better to have smaller or larger (and fewer) sets, or maybe one very large for everything non-conflicting and a few small ones for conflicting stuff. Greetings, Joachim ¹ I’m on a train, so no dict.leo.org available. Plus, I sometimes do read “ansatz” in English literature. Might have been from the beginning of the last century, though :-) -- Joachim "nomeata" Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part
Re: SAT based britney - formalisation of the problem
On Fri, 01 Jul 2011, Luk Claes wrote: > That's already an option in britney2. >From what I understood the auto-hinter is limited to a few simple cases. > This SAT based design seems overcomplex due to all the special casing > AFAICT. What special casing? > I also don't buy that a Conflicts relationship should be special > instead of a plain one like now. They are special only in the sense that they can't be represented as part of the SAT problem and thus need to be taken into account while creating the SAT problem to hand over to a SAT solver. > I won't stop anyone from experimenting with a SAT based solution, though > from a release point of view, I think it would be better to start using > britney2 and kill its bugs (which unfortunately will take some time > AFAICS) before diving into a SAT based adventure. They are not mutually exclusive. :) Cheers, -- Raphaël Hertzog ◈ Debian Developer Follow my Debian News ▶ http://RaphaelHertzog.com (English) ▶ http://RaphaelHertzog.fr (Français) -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110703101253.gj4...@rivendell.home.ouaza.com
Re: SAT based britney - formalisation of the problem
On 07/01/2011 10:25 PM, Raphael Hertzog wrote: > Hi, > > On Wed, 29 Jun 2011, Ralf Treinen wrote: >> In fact, from out point of view the DIMACS format or MAX-SAT input >> format are already a specific encoding technique, and we think that >> one should first find a logical specification of what exactly one >> tries to achieve, before thinking about a specific encoding into >> MAX-SAT or whatever other solver technology. > > But we're looking into having some concrete prototype in the short term > and I don't think that this kind of formalization will help us in that > regard. And I don't really see the expected benefits of this approach > either... > >> Another element of the precise specification would be: one wants to >> have a maximal solution. What precisely is the sense of maximality here? >> Maximal number of binary packages? Maximal number of source packages? Should >> there be a way to give more weight to more "important" packages? > > It would be nice to take the popularity into account indeed and give them > priority in terms of migration. > > But really this is just a cherry on the top. If we already get something > working that gives a coherent set of package that can move without manual > hints, it would be great. That's already an option in britney2. This SAT based design seems overcomplex due to all the special casing AFAICT. I also don't buy that a Conflicts relationship should be special instead of a plain one like now. I won't stop anyone from experimenting with a SAT based solution, though from a release point of view, I think it would be better to start using britney2 and kill its bugs (which unfortunately will take some time AFAICS) before diving into a SAT based adventure. Cheers Luk -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4e0e42b6.2070...@debian.org
Re: SAT based britney - formalisation of the problem
Hi, On Wed, 29 Jun 2011, Ralf Treinen wrote: > In fact, from out point of view the DIMACS format or MAX-SAT input > format are already a specific encoding technique, and we think that > one should first find a logical specification of what exactly one > tries to achieve, before thinking about a specific encoding into > MAX-SAT or whatever other solver technology. But we're looking into having some concrete prototype in the short term and I don't think that this kind of formalization will help us in that regard. And I don't really see the expected benefits of this approach either... > Another element of the precise specification would be: one wants to > have a maximal solution. What precisely is the sense of maximality here? > Maximal number of binary packages? Maximal number of source packages? Should > there be a way to give more weight to more "important" packages? It would be nice to take the popularity into account indeed and give them priority in terms of migration. But really this is just a cherry on the top. If we already get something working that gives a coherent set of package that can move without manual hints, it would be great. Cheers, -- Raphaël Hertzog ◈ Debian Developer Follow my Debian News ▶ http://RaphaelHertzog.com (English) ▶ http://RaphaelHertzog.fr (Français) -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110701202530.gb5...@rivendell.home.ouaza.com
Re: SAT based britney - handling conflicts
Hi, On Wed, 29 Jun 2011, Ralf Treinen wrote: > > 2/ conflicts for files that have moved between binary packages of > >different sources, policy requires the conflicts/breaks to be > >versionned, and the problem can be solved prior to the definition of > >the SAT problem by adding a supplementary rule that forces the upgrade > >of both source packages together > > I see two problems here: > > 1) lets call A and B the two binary packages between whom the file was > moved. Why should we add a constraint that A and B must migrate > together? Maybe A is not ready to migrate for any other reason, so > this would stop B from migrating. It might not be perfect, but IMO it's not a bad rule. When a file is moved, you really want to upgrade both sides at the same time if you have both installed. And this even if there's no other package that requires both packages to be installed at the same time. And the same logic holds with Breaks imo. > It is true that migrating B without migrating A would create a conflict > in testing between A and B, but maybe this conflict was already there > before, This is something we can and should detect indeed. > or maybe this new conflict is nothing we should worry about. I think > that in general it would be much too restrictive to require that, if a > set {A,B,C} was installable together before the migration, then the same > should hold after the migration. Expressed that way it seems stronger than what I said, but it might just be an impression. > One way of approaching this question is: we should put a requirement > that A and B must migrate together only in case there is a third > package, say C, that depends both on A and B. > > 2) However, this leads us to another problem that is related to an > example by Raphael early in this thread (this was the example with > disjunctive dependendencies in [1]): It is not so clear what it means > for a package to depend on another package since dependencies possibly > are not local to the package (like: C declares directly a dependency > on A and B) but may be indirect, like in any of the following > situations: I'm happy to consider disjunctive dependencies like conjunctive ones for the specific purpose of deciding whether a given package depends both on A and B. If the only result is that we force the migration of 2 packages together that might not have required it, it's not big deal IMO. > > 3/ permanent conflict for a service provided, the various conflicting > >packages are usually alternatives in dependencies (via virtual > >packages) and as such there's very few dependency trees that would > >force to have two conflicting packages together > > This is interesting. If we had a way of distinguishing this case 3/ > from 4/ below then we could implement a check of the archive that > detects whether some package strongly depends (in the above sense) on > two packages that are in a type-3 conflict. One special case would be > relatively easy to detect: C strongly depends on A,B, and both A and B > do a "provides: v" and "conflicts: v". Maybe we should add this check to > edos.debian.net :-) You could but I doubt we have such trivial cases. We could also review all the dependencies on real packages that provide & conflict on a virtual package. > Yes, indeed, it would be very useful to have this kind of justification > of a conflict. IMHO one can say that almost always dependencies are from > upstream and conflicts are from the package maintainer, so (s)he should > be able to give such a justification. A machine-readable format for this > would be extra good. Much like some "upgrade hints" could be helpful (i.e. a way to formalise the expected result on other packages). Cheers, -- Raphaël Hertzog ◈ Debian Developer Follow my Debian News ▶ http://RaphaelHertzog.com (English) ▶ http://RaphaelHertzog.fr (Français) -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110701202121.ga5...@rivendell.home.ouaza.com
Re: SAT based britney - handling conflicts
Hi, On Thu, Jun 23, 2011 at 10:25:55AM +0200, Raphael Hertzog wrote: > You're suggesting that one way to legitimately ignore the problem of > Conflicts would be to have the Debian policy only allow usage of Conflicts > for cases that cannot create problems during testing migration. Is that > right ? This is one reason, but it would also work the other way round: if we had a way to say that certain kinds of conflicts between packages must not occur then we should put it into policy so that we have a precise rule for telling problematic packages (in case the criterion is certain), or at least implement some check, using the edos tools for instance, to get a list of problematic cases that then can be investigated by hand. > I'm not sure how you can formally define whether a Conflicts is likely to > create migrations problems or not. I don't know either ATM how to do it. Anyway, your classification below helps a lot in this discussion: > However we know the common use cases for Conflicts and it seems to me that > they are ok: > > 1/ conflict for files that have moved within binary packages of the same >source, policy requires the conflicts/breaks to be versionned, and the >simple rule that requires all the binary packages of the source to >be moved together makes it moot Agreed. > 2/ conflicts for files that have moved between binary packages of >different sources, policy requires the conflicts/breaks to be >versionned, and the problem can be solved prior to the definition of >the SAT problem by adding a supplementary rule that forces the upgrade >of both source packages together I see two problems here: 1) lets call A and B the two binary packages between whom the file was moved. Why should we add a constraint that A and B must migrate together? Maybe A is not ready to migrate for any other reason, so this would stop B from migrating. It is true that migrating B without migrating A would create a conflict in testing between A and B, but maybe this conflict was already there before, or maybe this new conflict is nothing we should worry about. I think that in general it would be much too restrictive to require that, if a set {A,B,C} was installable together before the migration, then the same should hold after the migration. One way of approaching this question is: we should put a requirement that A and B must migrate together only in case there is a third package, say C, that depends both on A and B. 2) However, this leads us to another problem that is related to an example by Raphael early in this thread (this was the example with disjunctive dependendencies in [1]): It is not so clear what it means for a package to depend on another package since dependencies possibly are not local to the package (like: C declares directly a dependency on A and B) but may be indirect, like in any of the following situations: example a) Here, C logically depends on A and B, but this seen only when following dependency chains: Package: C Depends: E, F Package: E Depends: A Package: F Depends: B Package: A Conflicts B example b) Here, C logically depends on A and B, despite the fact that there is a disjunction involved. This is due to the fact that one of the branches of the disjunction, that is G, is invalidated. Package: C Depends: E, F|G Package: E Depends: A Package: F Depends: B Package: G Conflicts: A Package: B Conflicts: A For added fun, in any of the two examples the dependency from E to A may exist only in testing and not in unstable, or the other way round. That is, dependencies between packages may be indirect (we call these "strong dependencies" [2]). In this case they are relative to the repository, and not an isolated property of the package alone. > 3/ permanent conflict for a service provided, the various conflicting >packages are usually alternatives in dependencies (via virtual >packages) and as such there's very few dependency trees that would >force to have two conflicting packages together This is interesting. If we had a way of distinguishing this case 3/ from 4/ below then we could implement a check of the archive that detects whether some package strongly depends (in the above sense) on two packages that are in a type-3 conflict. One special case would be relatively easy to detect: C strongly depends on A,B, and both A and B do a "provides: v" and "conflicts: v". Maybe we should add this check to edos.debian.net :-) > 4/ permament conflict for a random file conflict that has not yet been >solved by renaming one side. This should be temporary. On Thu, Jun 23, 2011 at 01:06:44PM +0200, Joachim Breitner wrote: > A clean solution would be to require package maintainers to be specific > in their Conflicts, and have different fields for „This package should > generally not be installed along the other package.“ and „There is > something temporary wrong with the combination of this package and the > other package (usually with some versi
Re: SAT based britney - formalisation of the problem
Hi again, On Thu, Jun 23, 2011 at 01:06:44PM +0200, Joachim Breitner wrote: > > But according to our experience, for it to happen > > the problem should not be underestimated and, more importantly, should > > be properly formalized in a way which is independent of some specific > > encoding technique. > > Correct. That is why I suggested to use the format already used in a > MAX-SAT competition: > http://maxsat.ia.udl.cat/requirements/ In fact, from out point of view the DIMACS format or MAX-SAT input format are already a specific encoding technique, and we think that one should first find a logical specification of what exactly one tries to achieve, before thinking about a specific encoding into MAX-SAT or whatever other solver technology. One way of formalising this could be stated in the following style: 1) one defines first a certain abstraction of a distribution (like ignoring certain kinds of conflicts). This has of course to be exactly defined. Lets call this function a(). 2) lets say that the result of migration a set m to a testing distribution t is denoted as "t+m" 3) then you say: you look for a migration set m, such that for the testing distribution t: every "interesting" installation request that is satisfiable in a(t) is also satisfiable in a(t+m) Ideally, a() would be the identity function, that is no abstraction at all. The important question here would be to define what are precisely the "interesting" installation requests. Candidates are: - install one arbitrary single package A. This seems to be the definition currently used in britney. - install an arbitray *set* of packages together. This might be interesting in some cases, but seems to be too restrictive. Another element of the precise specification would be: one wants to have a maximal solution. What precisely is the sense of maximality here? Maximal number of binary packages? Maximal number of source packages? Should there be a way to give more weight to more "important" packages? Cheers -Zack and Ralf. -- Ralf Treinen Laboratoire Preuves, Programmes et Systèmes Université Paris Diderot, Paris, France. http://www.pps.jussieu.fr/~treinen/ -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110629155524.gb9...@pps.jussieu.fr
Re: SAT based britney
Hi, On Fri, Jun 24, 2011 at 08:26:28AM +0200, Raphael Hertzog wrote: > On Thu, 23 Jun 2011, Mehdi Dogguy wrote: > > Yes, but zack's answer confirms that (at least for me) we won't have any > > better solutions because testing migration problem isn't a trivial > > problem. So, personally, I remain not convinced that a new SAT solver > > would make things easier or better for us. We will have to enter "hints" > > (no matter what implementation is used) to make choices the solver can't > > infer automatically. (as julien said in the beginning of this thread). > Well, from my understanding the SAT solver would drop the need to provide > "easy" hints. "force" hints where you deliberately break dependencies > would require some input to tell the solver which constraints can be > broken. FWIW b2 (in non-compatible mode, i.e. not compatible with b1) tries to automatically hint some circular dependencies with easy before the main run: | def auto_hinter(self): | """Auto hint circular dependencies | | This method tries to auto hint circular dependencies analyzing the update | excuses relationships. If they build a circular dependency, which we already | know as not-working with the standard do_all algorithm, try to `easy` them. | """ | self.__log("> Processing hints from the auto hinter", type="I") | | # consider only excuses which are valid candidates | excuses = dict([(x.name, x) for x in self.excuses if x.name in self.upgrade_me]) | | def find_related(e, hint, first=False): | if e not in excuses: | return False | excuse = excuses[e] | if e in self.sources['testing'] and self.sources['testing'][e][VERSION] == excuse.ver[1]: | return True | if not first: | hint[e] = excuse.ver[1] | if len(excuse.deps) == 0: | return hint | for p in excuse.deps: | if p in hint: continue | if not find_related(p, hint): | return False | return hint | | # loop on them | cache = [] | for e in excuses: | excuse = excuses[e] | if e in self.sources['testing'] and self.sources['testing'][e][VERSION] == excuse.ver[1] or \ |len(excuse.deps) == 0: | continue | hint = find_related(e, {}, True) | if hint and e in hint and hint not in cache: | self.do_hint("easy", "autohinter", hint.items()) | cache.append(hint) It should help with the easy cases like shorewall. I don't know how well it works with others. And the switch will be to b2's compatible mode first, with b1 running in parallel and diffed afterwards. I'm mainly stating this for completeness and awareness. Not as any input whatsoever on the SAT idea. Kind regards, Philipp Kern -- .''`. Philipp KernDebian Developer : :' : http://philkern.de Stable Release Manager `. `' xmpp:p...@0x539.de Wanna-Build Admin `-finger pkern/k...@db.debian.org signature.asc Description: Digital signature
Re: SAT based britney
Hi, On Thu, 23 Jun 2011, Mehdi Dogguy wrote: > Yes, but zack's answer confirms that (at least for me) we won't have any > better solutions because testing migration problem isn't a trivial > problem. So, personally, I remain not convinced that a new SAT solver > would make things easier or better for us. We will have to enter "hints" > (no matter what implementation is used) to make choices the solver can't > infer automatically. (as julien said in the beginning of this thread). Well, from my understanding the SAT solver would drop the need to provide "easy" hints. "force" hints where you deliberately break dependencies would require some input to tell the solver which constraints can be broken. But it's still a big win IMO. > Having that said, what might help us is to optimize britney and fix some > corner cases that could be handled in a better way. Maybe start by documenting them with wishlist bug reports against release.debian.org ? Cheers, -- Raphaël Hertzog ◈ Debian Developer Follow my Debian News ▶ http://RaphaelHertzog.com (English) ▶ http://RaphaelHertzog.fr (Français) -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110624062628.gf27...@rivendell.home.ouaza.com
Re: SAT based britney
On 23/06/2011 15:34, Raphael Hertzog wrote: > > I don't think this is true. > > I saw notes taken during the last release team saying that they were > interested in the concept. And it's Mehdi who asked Ralf and Zack to give > their feedback because they wanted to gather more information on the idea. > Yes, but zack's answer confirms that (at least for me) we won't have any better solutions because testing migration problem isn't a trivial problem. So, personally, I remain not convinced that a new SAT solver would make things easier or better for us. We will have to enter "hints" (no matter what implementation is used) to make choices the solver can't infer automatically. (as julien said in the beginning of this thread). Having that said, what might help us is to optimize britney and fix some corner cases that could be handled in a better way. At least, that's my own reading of this thread. Other members in the Release Team can have different opinion on the subject. Regards, -- Mehdi Dogguy مهدي الدڤي mehdi@{dogguy.org,debian.org} -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4e034537.4090...@debian.org
Re: SAT based britney
Hi, Am Donnerstag, den 23.06.2011, 15:34 +0200 schrieb Raphael Hertzog: > > My impression is that the release team is happy with their current tools > > and that the new britney2 is meant to improve the things that are not so > > good today, so maybe a SAT-based britney would not be so great after > > all. > > I don't think this is true. > > I saw notes taken during the last release team saying that they were > interested in the concept. And it's Mehdi who asked Ralf and Zack to give > their feedback because they wanted to gather more information on the idea. ah, glad to hear that. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part
Re: SAT based britney
Hi, On Thu, 23 Jun 2011, Joachim Breitner wrote: > arch:all are, besides Conflicts, another flaw in my proposal that I > recently discovered: First, the semantics is not clear to me (should it > be installable on all arches? one any arch? on one particular, as it is > now?). FWIW the current britney only considers installability of arch all on i386. I would suggest to mimick this for a start. > My impression is that the release team is happy with their current tools > and that the new britney2 is meant to improve the things that are not so > good today, so maybe a SAT-based britney would not be so great after > all. I don't think this is true. I saw notes taken during the last release team saying that they were interested in the concept. And it's Mehdi who asked Ralf and Zack to give their feedback because they wanted to gather more information on the idea. Cheers, -- Raphaël Hertzog ◈ Debian Developer Follow my Debian News ▶ http://RaphaelHertzog.com (English) ▶ http://RaphaelHertzog.fr (Français) -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110623133422.gc17...@rivendell.home.ouaza.com
Re: SAT based britney
On the other hand, if we have less conflicts than in a debcheck run, solutions might be easier to find, as in the absence of conflicts hardly any backtracking is needed and the complexity should go down considerably. > But according to our experience, for it to happen > the problem should not be underestimated and, more importantly, should > be properly formalized in a way which is independent of some specific > encoding technique. Correct. That is why I suggested to use the format already used in a MAX-SAT competition: http://maxsat.ia.udl.cat/requirements/ Actually, MAX-SAT is more that we need: I don’t think we really have to find the largest (by numbers) set of packages that can migrate, and inclusion-maximal set of packages should suffice. This might also take the complexity down. My impression is that the release team is happy with their current tools and that the new britney2 is meant to improve the things that are not so good today, so maybe a SAT-based britney would not be so great after all. Thanks for your feedback, Joachim -- Joachim "nomeata" Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part
Re: SAT based britney
On 23/06/2011 10:25, Raphael Hertzog wrote: > Hi, > > On Wed, 22 Jun 2011, Stefano Zacchiroli wrote: >> The package installation problem (as routinely solved by the SAT-based >> edos-distcheck monitor [1]) is: find a subset of the repository that is >> coherent (satisfies all dependencies and conflicts) and contains the >> wanted package. This is a typical NP-complete problem, due to the >> presence of both disjunctive dependencies and conflicts [2]. >> >> [1] http://edos.debian.net/weather/ >> [2] http://www.computer.org/portal/web/csdl/doi/10.1109/ASE.2006.49 > > Can we have a copy of this article? > It's available on http://hal.inria.fr/hal-00149566_v1/ Regards, -- Mehdi Dogguy مهدي الدڤي mehdi@{dogguy.org,debian.org} -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/4e030c1a.5050...@debian.org
Re: SAT based britney
Hi, On Wed, 22 Jun 2011, Stefano Zacchiroli wrote: > The package installation problem (as routinely solved by the SAT-based > edos-distcheck monitor [1]) is: find a subset of the repository that is > coherent (satisfies all dependencies and conflicts) and contains the > wanted package. This is a typical NP-complete problem, due to the > presence of both disjunctive dependencies and conflicts [2]. > > [1] http://edos.debian.net/weather/ > [2] http://www.computer.org/portal/web/csdl/doi/10.1109/ASE.2006.49 Can we have a copy of this article? > This analysis, of course, relies on the assumption that package > dependencies can be arbitrarily complex, with inter-related conflicts > and alternative dependencies. As Joachim and others have already pointed > out, conflicts are not really taken into account in his proposal. Maybe > it is indeed possible to find a good approximation of the problem by > ignoring conflicts and doing some subsequent repair-step, by hand or > automatically. However, we have problems of understanding what exactly > the justification of such an approximation would be. Can we really make > any assumptions on what kind of conflicts would be RC bugs, and hence do > not have to be taken into account? If we can find such assumptions then > we should probably cast them into policy. How to deal with conflicts > seems to be a crucial point to evaluate the proposal in more depth. I'm not sure I underestand so I'll reformulate and you'll tell me if I got it right. You're suggesting that one way to legitimately ignore the problem of Conflicts would be to have the Debian policy only allow usage of Conflicts for cases that cannot create problems during testing migration. Is that right ? I'm not sure how you can formally define whether a Conflicts is likely to create migrations problems or not. However we know the common use cases for Conflicts and it seems to me that they are ok: 1/ conflict for files that have moved within binary packages of the same source, policy requires the conflicts/breaks to be versionned, and the simple rule that requires all the binary packages of the source to be moved together makes it moot 2/ conflicts for files that have moved between binary packages of different sources, policy requires the conflicts/breaks to be versionned, and the problem can be solved prior to the definition of the SAT problem by adding a supplementary rule that forces the upgrade of both source packages together 3/ permanent conflict for a service provided, the various conflicting packages are usually alternatives in dependencies (via virtual packages) and as such there's very few dependency trees that would force to have two conflicting packages together 4/ permament conflict for a random file conflict that has not yet been solved by renaming one side. This should be temporary. To mitigate the problem we could however do some special analysis on non-versionned conflicts before generating the SAT problem to see whether the situation of conflicting packages changed between testing and unstable: if yes we might want to do a first run that does not introduce new Conflicts in testing. > helping out with that. But according to our experience, for it to happen > the problem should not be underestimated and, more importantly, should > be properly formalized in a way which is independent of some specific > encoding technique. Can you be more explicit? Is DIMACS the "specific encoding technique" that you're referring to? If yes, how else should the problem be formalized? Cheers, -- Raphaël Hertzog ◈ Debian Developer Follow my Debian News ▶ http://RaphaelHertzog.com (English) ▶ http://RaphaelHertzog.fr (Français) -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110623082555.ga9...@rivendell.home.ouaza.com
Re: SAT based britney
Hello, sorry for dropping in late in this discussion, but we learned only recently that this discussion was going on on debian-release. We have discussed a bit the proposal within the Mancoosi team in Paris; I will try to reply from the point of view of the team (hence, the "we"). In short: - we are generally in favor of using existing logical formalisms and resolver engines for this kind of problems. - however, we think that the testing migration problem is harder than SAT solving. The package installation problem (as routinely solved by the SAT-based edos-distcheck monitor [1]) is: find a subset of the repository that is coherent (satisfies all dependencies and conflicts) and contains the wanted package. This is a typical NP-complete problem, due to the presence of both disjunctive dependencies and conflicts [2]. [1] http://edos.debian.net/weather/ [2] http://www.computer.org/portal/web/csdl/doi/10.1109/ASE.2006.49 Even oversimplifying, the testing migration problem is: find a subset of the set of migration candidates such that for all packages x in testing after performing this migration there is a coherent subset of testing that contains x That, is the structure is "exists-forall-exists", instead of just "exist". This should yank the problem out of NP-complete into some higher level in the polynomial hierarchy. And this is letting aside the even harder problem of finding the "best" subset---as opposed as finding just one such subset---in accordance to the goals of the release team. This analysis, of course, relies on the assumption that package dependencies can be arbitrarily complex, with inter-related conflicts and alternative dependencies. As Joachim and others have already pointed out, conflicts are not really taken into account in his proposal. Maybe it is indeed possible to find a good approximation of the problem by ignoring conflicts and doing some subsequent repair-step, by hand or automatically. However, we have problems of understanding what exactly the justification of such an approximation would be. Can we really make any assumptions on what kind of conflicts would be RC bugs, and hence do not have to be taken into account? If we can find such assumptions then we should probably cast them into policy. How to deal with conflicts seems to be a crucial point to evaluate the proposal in more depth. Besides this principal scepticism some more specific remarks: - having all packages installable in testing is a noble goal, however during a release cycle it typically gets lost even in i386, and has to be restored by the release team before the release. Even then it is usually not satisfied on the less popular architectures, at least not for arch:all packages (see [3]). For this reason one probably would rather want to say something like: any package that was installable before the migration should also be installable after the migration. This also includes the constraint that the migration process should not remove packages from testing (unless intervention by the release team). Otherwise, a maximal solution could imply to migrate 2 packages to testing and to remove one. [3] http://edos.debian.net/edos-debcheck/stable.php - About efficiency: as Mark said, modern SAT solvers are extremly efficient. However, one should not forget that one has to consider all architectures in parallel, that is we need a solution that works on all achitectures. You would have at least one variable per package in testing, plus one per migration candidate plus variables for source packages and probably also for virtual packages. This would yield, as a very rough estimate, 50k variables, multiplied by 11 release architectures gives 550k variables. And that's a lot. All that considered, this mail does not want to be a dissuasion in trying this out. Our experience has shown that there are indeed communities of researchers and practitioners out there which are very much interested in trying their solvers against "real" problem instances coming from realities like Debian. For instance, the MISC competition we have set up [4] to attack the "package upgrade problem" on top of the CUDF format [5] has given great results in terms of new solvers and encoding techniques; such results are now flowing into Debian package managers [6]. It might be possible to setup a similar virtuous cycle that will challenge researchers to attack the testing migration problem and we are actually interested in helping out with that. But according to our experience, for it to happen the problem should not be underestimated and, more importantly, should be properly formalized in a way which is independent of some specific encoding technique. [4] http://www.mancoosi.org/misc-2011/ [5] http://www.mancoosi.org/cudf/primer/ [6] http://packages.debian.org/experimental/apt-cudf On behalf of the Mancoosi group in Paris, Ralf Treinen
Re: SAT based britney
[Resent, as l.d.o still does not like mails from my private address mail@...] Hi, On Thu, 26 May 2011 15:01:49 +0200, Joachim Breitner wrote: > Correct. MAX-SAT (or, more precise, PMAX-SAT) is what we want to use. BTW, here is a competition for PMAX-SAT, with a DIMACS-based input and some results: http://maxsat.ia.udl.cat/requirements/ specification http://www.maxsat.udl.cat/10/results/ last years result Didn't check yet which of these solvers are Free Software (I have checked other lists before and was disappointed how many of them do not share the source, hoping to make some bucks with it it seems), but anyways, I'd say this specifies our layer 1. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part
Re: SAT based britney
Hi, On Thu, 26 May 2011 14:03:17 +0200, Raphael Hertzog wrote: > On Thu, 26 May 2011, Joachim Breitner wrote: >> Am Donnerstag, den 26.05.2011, 11:58 +0200 schrieb Raphael Hertzog: >> The rationales for this separation: >> * Layer 1 needs to be fast, so eventually it is likely implemented in C >> using the picosat library. > > For now, I'm just calling the picosat executable. What improvement do you > expect by switching to the library? Remember that to actually implement PMAX-SAT in the iterative way I suggested earlier, we'd need to run picosat repeatedly. Using the picosat library might save some time there, and possibly make it easier to start the search from the current assignments to the atoms, which also might speed things up. But this is a minor detail and can be postboned to when we actually have to worry about performance and after the interfaces are fixed. > I guess it might be more convenient (or even required) for the part where > you try to get an answer to "why did this package not migrate". But I fear > that switching to the library means we're loosing a clear interface for > people who want to provide an alternative layer 1. At least we must be > careful to keep a reasonable abstraction on top of it. The interface for layer 1 is a dimacs file with CNF clauses (just like for picosat) + the set of desired atoms -- whether it is implemented as a python script calling picosat, or a C program linking to picosat does not matter. (I guess we still need to clarify our layering more, as there seems to be some confusion). >> * Implementing the layer 1 interface might be interesting for >> non-Debian-related research parties whose result might be faster than >> anything we come up with. They will likely not want to worry about our >> human-readable syntax. > > Somehow this is already the case since we can use any SAT solver provided > we express our problem with a DIMACS file. Or do you expect them to work > specifically on the MAX-SAT problem that we want to implement on top of a > normal SAT solver? Correct. MAX-SAT (or, more precise, PMAX-SAT) is what we want to use. Our implementation based on SAT is just to get something working quickly, but I suspect that by tackling PMAX-SAT directly, better performance can be achieved. > Speaking of languages, I picked python because britney2 is in python, > because we have python-apt, and several members of the release team > seem to know/use/appreciate python. > > I know Perl & C too, but I'm afraid I don't know Ocaml & Haskell. > > I am aware that Ocaml could have been interesting due to its dose3 library. Of course there is no _right_ answer. One more reason for good layering: Assume layer 1 in C (due to speed), layer 2 in Haskell (due to the ease it manipulates expression trees andits good parsing libraries) and layer 3 in python (good debian-specific libraries and here is the code that the release team might want to modify if some constraints change, or additional suites are to be considered). Greetings, Joachim -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/06c6984bb27f34b3fd028c75bc9e8c73@localhost
Re: SAT based britney
On Thu, 26 May 2011, Joachim Breitner wrote: > Am Donnerstag, den 26.05.2011, 11:58 +0200 schrieb Raphael Hertzog: > > We could instead include it as a "post-process filter". We'd run > > edos-distcheck on the result to identify what new packages > > are uninstallable, and from this we could try to identify the > > supplementary constraint to add. > > with identify, do you mean manually (which is also what I am suggesting) > or automatically? Automatically if possible. > Automatically might be difficult. One approach would be to forcibly > prevent any afterwards non-installable package from migrating, and then > re-run sat-britney. The non-installability can be caused on a package that has not migrated by the migration of another package. So it needs to be more subtle than that. > But even then the problem needs to be tackled manually later. Or maybe > the situation is left until the problem dissolves by itself. Yes, something like that. Either the non-migration causes troubles and I expect the release managers/maintainer to investigate, or it doesn't cause troubles and it might go away alone at some point. > Ah, ok. So this function would actually be used in the next layer (the > one that takes the readable constraints and feeds them to the > MAX-SAT-solver). Yes. > The rationales for this separation: > * Layer 1 needs to be fast, so eventually it is likely implemented in C > using the picosat library. For now, I'm just calling the picosat executable. What improvement do you expect by switching to the library? I guess it might be more convenient (or even required) for the part where you try to get an answer to "why did this package not migrate". But I fear that switching to the library means we're loosing a clear interface for people who want to provide an alternative layer 1. At least we must be careful to keep a reasonable abstraction on top of it. > * Implementing the layer 1 interface might be interesting for > non-Debian-related research parties whose result might be faster than > anything we come up with. They will likely not want to worry about our > human-readable syntax. Somehow this is already the case since we can use any SAT solver provided we express our problem with a DIMACS file. Or do you expect them to work specifically on the MAX-SAT problem that we want to implement on top of a normal SAT solver? > * Layer 2 does string processing, so is likely not a good fit for C and > rather something like Perl, Python or Haskell. This is ok as it is not > doing the actually intensive calculations. > * Layer 2 is still generic, so might be used (and thus tested and > improved) by others. This more or less maximizes the non-Debian-specific > part. > * If it turns out that layer 3 and layer 2 are implemented in the same > language (e.g. Ocaml, given the existing Debian libraries, or Haskell), > it should be possible to combine them in one executable that does not > actually serialize, write out, write and and parse the constraints but > pass them around in their abstract form directly, thus saving time and > space. Speaking of languages, I picked python because britney2 is in python, because we have python-apt, and several members of the release team seem to know/use/appreciate python. I know Perl & C too, but I'm afraid I don't know Ocaml & Haskell. I am aware that Ocaml could have been interesting due to its dose3 library. Cheers, -- Raphaël Hertzog ◈ Debian Developer Follow my Debian News ▶ http://RaphaelHertzog.com (English) ▶ http://RaphaelHertzog.fr (Français) -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110526120317.ga3...@rivendell.home.ouaza.com
Re: SAT based britney
Hi, Am Donnerstag, den 26.05.2011, 11:58 +0200 schrieb Raphael Hertzog: > On Tue, 24 May 2011, Joachim Breitner wrote: > That said, I'm not sure we need to deal with it from the start at the SAT > solver level. > > We could instead include it as a "post-process filter". We'd run > edos-distcheck on the result to identify what new packages > are uninstallable, and from this we could try to identify the > supplementary constraint to add. with identify, do you mean manually (which is also what I am suggesting) or automatically? Automatically might be difficult. One approach would be to forcibly prevent any afterwards non-installable package from migrating, and then re-run sat-britney. But even then the problem needs to be tackled manually later. Or maybe the situation is left until the problem dissolves by itself. > Joachim wrote: > > Am Mittwoch, den 25.05.2011, 12:06 +0200 schrieb Raphael Hertzog: > > > Next I want to add some functions like add_implication(a, b) > > > that would convert the common boolean constraints the we want > > > to handle to CNF. > > > > Hmm, not sure if this should be added at this layer. I’d rather have > > implemented the MAX-SAT part on top of picosat using a minimal, CNF > > syntax. This keeps this part simple and easy to replace with more > > powerful implementations. > > I don't see what it changes here. The DIMACS object stores all the > information in plain CNF format. The add_implication() is just a > convenience function to feed the data. Ah, ok. So this function would actually be used in the next layer (the one that takes the readable constraints and feeds them to the MAX-SAT-solver). > For now it's not clear to me what your next layer is. Here is my three-layering-idea, concisely put: 1. A max-sat-solver, with input and output in DIMACS format. 2. A still generic wrapper that takes human-readable clauses and has support for things like "A implies one of B C D", "at most one of A B C", "not C" as constraints. When you replace A, B,... by atoms like ghc_7.0.3-1_amd64/testing, then these constrains become intelligible. 3. The Debian specific part, which takes the Packages/Sources and generates the readable constraints, and further interprets the result to pass it to dak. The rationales for this separation: * Layer 1 needs to be fast, so eventually it is likely implemented in C using the picosat library. * Implementing the layer 1 interface might be interesting for non-Debian-related research parties whose result might be faster than anything we come up with. They will likely not want to worry about our human-readable syntax. * Layer 2 does string processing, so is likely not a good fit for C and rather something like Perl, Python or Haskell. This is ok as it is not doing the actually intensive calculations. * Layer 2 is still generic, so might be used (and thus tested and improved) by others. This more or less maximizes the non-Debian-specific part. * If it turns out that layer 3 and layer 2 are implemented in the same language (e.g. Ocaml, given the existing Debian libraries, or Haskell), it should be possible to combine them in one executable that does not actually serialize, write out, write and and parse the constraints but pass them around in their abstract form directly, thus saving time and space. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part
Re: SAT based britney
Hi, On Tue, 24 May 2011, Joachim Breitner wrote: > If cases where a conflict (of the non-versioned-kind, i.e. a conflict > only with regard to installability and not with regard to presence in an > suite) has an effect on the migrateability occur often, then this poses > indeed a serious problem. Can we still somehow map this to a SAT > problem? > > The complete solution would involve separate logical atoms for migrating > a package (foo_1234/testing) and for checking installability > (foo_1234/testing/installable_for_foo_1234, > bar_987/testing/installable_for_foo_1234). But for the installability of > each package, the installability of the dependencies have to be solved > independed, causing a quadratic explosion of atoms. Clearly not > possible. Agreed that it's not possible that way. And I don't see any other way to express it as a SAT problem. That said, I'm not sure we need to deal with it from the start at the SAT solver level. We could instead include it as a "post-process filter". We'd run edos-distcheck on the result to identify what new packages are uninstallable, and from this we could try to identify the supplementary constraint to add. --- On another note, I have started some early coding to play with the concept/idea. You can grab it here: git clone git://anonscm.debian.org/~hertzog/sat-britney.git http://anonscm.debian.org/gitweb/?p=users/hertzog/sat-britney.git You can do simple stuff like this: $ python >>> import sat >>> dimacs = sat.DIMACS() >>> dimacs.init_variable("a", False) >>> dimacs.add_implication("!a", "b") >>> dimacs.add_implication("b", "c") >>> s = sat.Solver() >>> s.run(dimacs) (True, {'a': False, 'c': True, 'b': True}) >>> dimacs.init_variable("c", False) >>> s.run(dimacs) (False, None) I shared it to Joachim earlier and he asked me to continue the discussion here. So that's what I'm doing now. Joachim wrote: > Am Mittwoch, den 25.05.2011, 12:06 +0200 schrieb Raphael Hertzog: > > Next I want to add some functions like add_implication(a, b) > > that would convert the common boolean constraints the we want > > to handle to CNF. > > Hmm, not sure if this should be added at this layer. I’d rather have > implemented the MAX-SAT part on top of picosat using a minimal, CNF > syntax. This keeps this part simple and easy to replace with more > powerful implementations. I don't see what it changes here. The DIMACS object stores all the information in plain CNF format. The add_implication() is just a convenience function to feed the data. > Having more readable operators such as implication, at-most-one and so > on should be the responsibility of the next layer. At least in my > design. It's really early, I have not put much thoughts on the precise design of the objects, I add stuff as I need them. So I'm not attached to the design... feel free to make suggestions, preferrably in the form of patches. :) For now it's not clear to me what your next layer is. Cheers, -- Raphaël Hertzog ◈ Debian Developer Follow my Debian News ▶ http://RaphaelHertzog.com (English) ▶ http://RaphaelHertzog.fr (Français) -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110526095801.ga32...@rivendell.home.ouaza.com
Re: SAT based britney
Hallo, Am Montag, den 16.05.2011, 20:17 +0200 schrieb Julien Cristau: > On Mon, May 16, 2011 at 19:03:35 +0530, Joachim Breitner wrote: > > Am Montag, den 16.05.2011, 11:44 +0200 schrieb Raphael Hertzog: > > > On Sun, 15 May 2011, Joachim Breitner wrote: > > > > And in > > > > the absence of conflicts, this means that A and B are co-installable. > > > > > > Yes, but what about this, T is the package we're considering to migrate. > > > > > > T depends on A, B > > > A depends on D1 | D2 > > > B depends on D2 | D3 > > > D1 conflicts with D3 > > > D2 is not satisfiable/installable in testing (but is in unstable) > > > D1 is installable alone in testing > > > D3 is installable alone in testing > > > There is a conflict, so I’m not claiming to handle this perfectly. I > > hope such cases are rare. And note that my system allows for manual > > addition of rules, so if we come across situations that are treated > > insufficiently, and such situations are rare enough, additional > > constraints can be added by the RM team. E.g. in this case, after some > > thought, the additional constraint "T implies D2" should be sufficient. > > > Not sure I'd want to replace one manual task (adding hints for packages > which need to migrate together) with another (handling conflicts). I > have no precise idea how common that situation is; I fear it's more > common than you seem to think. > > I like the part where we wouldn't have to keep maintaining much of this > code though... If cases where a conflict (of the non-versioned-kind, i.e. a conflict only with regard to installability and not with regard to presence in an suite) has an effect on the migrateability occur often, then this poses indeed a serious problem. Can we still somehow map this to a SAT problem? The complete solution would involve separate logical atoms for migrating a package (foo_1234/testing) and for checking installability (foo_1234/testing/installable_for_foo_1234, bar_987/testing/installable_for_foo_1234). But for the installability of each package, the installability of the dependencies have to be solved independed, causing a quadratic explosion of atoms. Clearly not possible. Back to the concrete example: Maybe the situation should be manually detected _once_ and then handled by asking T to depend on D2 (after all, T will not be installable without D2). Or, if this is not something you want to impose on the maintainer, some kind of permanent overwrite could be passed to SAT-britney with the same effect. Still, every case needs to be handled manually, but at least only once. But it’s up to you (as in the release team) to decide what you’d rather do: Figure out conflict problems manually (or just let them migrate to testing and handle them later), or write hints for large transition. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part
Re: SAT based britney
On Mon, May 16, 2011 at 19:03:35 +0530, Joachim Breitner wrote: > Hi, > > Am Montag, den 16.05.2011, 11:44 +0200 schrieb Raphael Hertzog: > > On Sun, 15 May 2011, Joachim Breitner wrote: > > > And in > > > the absence of conflicts, this means that A and B are co-installable. > > > > Yes, but what about this, T is the package we're considering to migrate. > > > > T depends on A, B > > A depends on D1 | D2 > > B depends on D2 | D3 > > D1 conflicts with D3 > > D2 is not satisfiable/installable in testing (but is in unstable) > > D1 is installable alone in testing > > D3 is installable alone in testing > > > > That said, I don't know if the current britney would detect anything wrong > > here either. > It does. > There is a conflict, so I’m not claiming to handle this perfectly. I > hope such cases are rare. And note that my system allows for manual > addition of rules, so if we come across situations that are treated > insufficiently, and such situations are rare enough, additional > constraints can be added by the RM team. E.g. in this case, after some > thought, the additional constraint "T implies D2" should be sufficient. > Not sure I'd want to replace one manual task (adding hints for packages which need to migrate together) with another (handling conflicts). I have no precise idea how common that situation is; I fear it's more common than you seem to think. I like the part where we wouldn't have to keep maintaining much of this code though... Cheers, Julien -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110516181713.gb2...@radis.liafa.jussieu.fr
Re: SAT based britney
Hi, Am Montag, den 16.05.2011, 11:44 +0200 schrieb Raphael Hertzog: > On Sun, 15 May 2011, Joachim Breitner wrote: > > And in > > the absence of conflicts, this means that A and B are co-installable. > > Yes, but what about this, T is the package we're considering to migrate. > > T depends on A, B > A depends on D1 | D2 > B depends on D2 | D3 > D1 conflicts with D3 > D2 is not satisfiable/installable in testing (but is in unstable) > D1 is installable alone in testing > D3 is installable alone in testing > > That said, I don't know if the current britney would detect anything wrong > here either. There is a conflict, so I’m not claiming to handle this perfectly. I hope such cases are rare. And note that my system allows for manual addition of rules, so if we come across situations that are treated insufficiently, and such situations are rare enough, additional constraints can be added by the RM team. E.g. in this case, after some thought, the additional constraint "T implies D2" should be sufficient. > > Yes, I had things like this in mind. Also, it should be able to handle > > multi-way-transitions between several suits > > (unstable/testing/cut/volatile/stable-updates) with invariants that > > should hold across different suites (e.g. a package in cut may not be > > older than the one in testing) in one run. That is why I have added the > > suffix /testing in the proposed syntax. > > I'd rather start with small objectives and build up if it works well. :-) > > I started looking up some of the literature around SAT solver. I suppose > you are more familiar with the topic than me. > > Did you have some ideas of how to implement all this? > > We have quite some SAT solvers in Debian and they all seem to support > the DIMACS format as input (which requires to transform all boolean > constraints to CNF). But they give a solution or they return > unsatisfiable. > > You mentioned the possibility to do several "iterations" to implement a > MAX-SAT > solver. Can you elaborate? Here is my proposal of using a regular SAT solver to solve PMAX-SAT in the way we want, with an additional overhead linear in the number of desired atoms (not great, but maybe still good enough in practice): Run MAX solver once and get at one satisfiable set (this should exist, in the worst case it is migrate nothing. If it does not, then testing is broken and the RM team has to remove some constraints, e.g. for a broken package, the constraint that at least some version of a package has to exist. This has the effect of allowing it to be dropped from testing if needed). Then take all desired atoms (i.e. migrations) set to true, and add them as axioms to the set of clauses. Take one atom not set to true, and also force it true. If satisfiable, we know this can migrate and we repeat with the next atom. This might set other desired atoms to true, for example if the package was part of a larger migration that the initial run did no migrate – great, less work to do. Try setting the next desired atom to true. If not satisfiable, we get a minimal unsatisfiable core. This is the explanation why this package cannot be migrate. In further runs we set this atom to false (which probably only has the effect of speeding things up) and try setting the next desired atom to true. If the SAT solver can be tricked, e.g. by default assignments, picosat has such an option, to set a lot of desired atoms to true in the first run, the number of iterations might be low. If this turns out to be insufficient, we will need help from theoretical computer scientists. But my plan would be to implement this DIMACS-based PMAX-solver algorithm with a DIMACS-like interface to have something to work with, and then improvements below this layer can be implemented independent of the rest of the sytem. Greetings, Joachim -- Joachim "nomeata" Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description: This is a digitally signed message part
Re: SAT based britney
Hi, On Sun, 15 May 2011, Joachim Breitner wrote: > No, that should be no problem. Versioned constraints are resolved > precisely to the list of existing concrete packages which fulfill the > requirement before the resulting constraint being passed to the > SAT-solver, so no irregularities there. As long as the invariant that > for each binary package, there is exactly one version in testing, is not > overwritten manually then the migratability of A and B means A is > installable and B is installable using the same set of packages. And in > the absence of conflicts, this means that A and B are co-installable. Yes, but what about this, T is the package we're considering to migrate. T depends on A, B A depends on D1 | D2 B depends on D2 | D3 D1 conflicts with D3 D2 is not satisfiable/installable in testing (but is in unstable) D1 is installable alone in testing D3 is installable alone in testing That said, I don't know if the current britney would detect anything wrong here either. > Yes, I had things like this in mind. Also, it should be able to handle > multi-way-transitions between several suits > (unstable/testing/cut/volatile/stable-updates) with invariants that > should hold across different suites (e.g. a package in cut may not be > older than the one in testing) in one run. That is why I have added the > suffix /testing in the proposed syntax. I'd rather start with small objectives and build up if it works well. :-) I started looking up some of the literature around SAT solver. I suppose you are more familiar with the topic than me. Did you have some ideas of how to implement all this? We have quite some SAT solvers in Debian and they all seem to support the DIMACS format as input (which requires to transform all boolean constraints to CNF). But they give a solution or they return unsatisfiable. You mentioned the possibility to do several "iterations" to implement a MAX-SAT solver. Can you elaborate? Cheers, -- Raphaël Hertzog ◈ Debian Developer Follow my Debian News ▶ http://RaphaelHertzog.com (English) ▶ http://RaphaelHertzog.fr (Français) -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110516094434.gf7...@rivendell.home.ouaza.com
Re: SAT based britney
Hi, Am Freitag, den 13.05.2011, 22:17 +0200 schrieb Raphael Hertzog: > On Wed, 11 May 2011, Joachim Breitner wrote: > > * Do you think it can handle all current and anticipatable > > requirements? (I’m not involved in release team work and might missed > > some conditions.) > > You mention that the tool can't deal properly with Conflicts for ensuring > installability. I wonder if there aren't more similar problems related > to this requirement of keeping package installable. > > For instance, I wonder how you'd deal with alternative dependencies? I can > imagine that you can find out solutions where each package can be > individually installed but where both sets of dependencies are not > co-installable (in particular if you solve some dependencies with > different versions of the same binary package, and here Conflicts is not > involved). No, that should be no problem. Versioned constraints are resolved precisely to the list of existing concrete packages which fulfill the requirement before the resulting constraint being passed to the SAT-solver, so no irregularities there. As long as the invariant that for each binary package, there is exactly one version in testing, is not overwritten manually then the migratability of A and B means A is installable and B is installable using the same set of packages. And in the absence of conflicts, this means that A and B are co-installable. > > * What do you think of the advantages and improvements that I point > > out? > > To me it looks great but I have no experience with SAT solver and I > wonder whether the sheer amount of predicates that it will have to handle > will not result in something dog-slow. That is indeed a problem. I have heard impressive results for todays SAT-solvers (given the NP-hardness of the problem), but it is hard to predict for me. I’d say that given the reduced number of Conflicts (which make things harder) there is a chance that it will not take longer than a full edos-distcheck run. > > The system is three-layered. On the lowest level is a general purpose > > MAX-SAT-solver. It takes as input a propositional formula in conjunctive > > normal form with some variables, and a list of “desired” variables. It > > Presented that way, it corresponds exactly to what is needed to implement > the rolling distributions described in > http://lists.debian.org/debian-devel/2011/05/msg00275.html > > Maybe it can be a good testbed for the concept. Yes, I had things like this in mind. Also, it should be able to handle multi-way-transitions between several suits (unstable/testing/cut/volatile/stable-updates) with invariants that should hold across different suites (e.g. a package in cut may not be older than the one in testing) in one run. That is why I have added the suffix /testing in the proposed syntax. > > * a satisfying assignment which is maximal, i.e. there is no satisfying > > assignment where a strict superset of desired variables is true. > > FWIW, Britney tries to migrate packages one by one, starting with the most > important packages. It doesn't consider them all together. ...which makes life hard for the release-team with transitions such as Haskell, where >200 source packages have to migrate at once, right? > britney generates a Heidi file. It has this format: > > > can also have the special value "source" for the source > package. The heidi file is then fed to "dak control-suite". Great, looks almost like the atoms I proposed. So from a satisfying assignment generating this is trivial. > > So here is my heuristic to handle this: If a Conflicts or Breaks has an > > upper version constraint, then add a corresponding clause. If not, > > ignore it. > > > > This might lead to uninstallable packages in testing in corner cases > > (imagine some package depending on both exim4 and postfix), but such a > > case is clearly an RC bug that ought to be filed by someone and which > > would be detected by a run of edos-distcheck on unstable already. > > I think it's more likely that such problems happen with conflicting > dependencies further down in the dependency tree (i.e. not directly on the > same package). Sure, but it would still be uncovered easily by edos-distcheck on unstable or testing, and subsequently be fixed like any other RC bug. (I know this is a weakness of the proposal, so I’m arguing hard that the weakness is not too serious :-)) > > PS: Please give honest feedback. I’d rather been told that this is > > rubbish and not spend time on it than spending time on it and afterwards > > noticing that it will not be used. > > Remember my feedback is one of an (interested) outsider, I have almost no > experience feeding hints to britney. Nevertheless, thanks for the feedback! Greetings from Manali, Joachim -- Joachim "nomeata" Breitner Debian Developer nome...@debian.org | ICQ# 74513189 | GPG-Keyid: 4743206C JID: nome...@joachim-breitner.de | http://people.debian.org/~nomeata signature.asc Description:
Re: SAT based britney
Heya, Raphael Hertzog writes: >> * What do you think of the advantages and improvements that I point >> out? > To me it looks great but I have no experience with SAT solver and I > wonder whether the sheer amount of predicates that it will have to handle > will not result in something dog-slow. I am pretty sure that this is not the case. SAT solvers are extremely efficient and reduction to SAT has replaced many specialised search algorithms because of its efficiency. Apart from that, one can always feed complex instances back to the SAT Race, making hundreds of researchers fine-tune their tools on the specific problem structure... However, finding minimal unsatisfiable cores (which are basically needed to get output for the "A might not migrate because of ..." bits) is harder, especially if they have an even weaker definition as in the MAX-SAT setting. For that, i would recommend reviewing the current literature. Marc pgpyF04Dj3TzH.pgp Description: PGP signature
Re: SAT based britney
Hi, Disclaimer: I'm not a member of the release team. On Wed, 11 May 2011, Joachim Breitner wrote: > * Do you get it? (i.e., did I explain it well enough or are there > questions left) I think so. > * Do you think it can handle all current and anticipatable > requirements? (I’m not involved in release team work and might missed > some conditions.) You mention that the tool can't deal properly with Conflicts for ensuring installability. I wonder if there aren't more similar problems related to this requirement of keeping package installable. For instance, I wonder how you'd deal with alternative dependencies? I can imagine that you can find out solutions where each package can be individually installed but where both sets of dependencies are not co-installable (in particular if you solve some dependencies with different versions of the same binary package, and here Conflicts is not involved). I don't know how britney does it either, but I know that it tries to never lower the resulting installability count of testing. > * What do you think of the advantages and improvements that I point > out? To me it looks great but I have no experience with SAT solver and I wonder whether the sheer amount of predicates that it will have to handle will not result in something dog-slow. > * Is it worth the hassle? Given your description, it doesn't look like too complicated to implement. I think it's worth trying at least if we are not able to find conceptual problems (or if we can solve them). > And before going into the details, here the high-level goals that I want > to achieve with SAT-britney (stupid working title): All of the goals outlined are great IMO. > The system is three-layered. On the lowest level is a general purpose > MAX-SAT-solver. It takes as input a propositional formula in conjunctive > normal form with some variables, and a list of “desired” variables. It Presented that way, it corresponds exactly to what is needed to implement the rolling distributions described in http://lists.debian.org/debian-devel/2011/05/msg00275.html Maybe it can be a good testbed for the concept. > * a satisfying assignment which is maximal, i.e. there is no satisfying > assignment where a strict superset of desired variables is true. FWIW, Britney tries to migrate packages one by one, starting with the most important packages. It doesn't consider them all together. > dependencies etc. It also needs to get the age and buggyness data. The > result of the solver will be used to create the new set of packages for > testing (how does this work, by the way – does britney create a new > Packages file, does she create a .changes file processed by dak or even > another way?). Initially, it could just create a large hint that can be britney generates a Heidi file. It has this format: can also have the special value "source" for the source package. The heidi file is then fed to "dak control-suite". > So here is my heuristic to handle this: If a Conflicts or Breaks has an > upper version constraint, then add a corresponding clause. If not, > ignore it. > > This might lead to uninstallable packages in testing in corner cases > (imagine some package depending on both exim4 and postfix), but such a > case is clearly an RC bug that ought to be filed by someone and which > would be detected by a run of edos-distcheck on unstable already. I think it's more likely that such problems happen with conflicting dependencies further down in the dependency tree (i.e. not directly on the same package). > what would migrate. Assume this RC bug is fixed, what would change). But > maybe so slow that it still can be used for britney, but only with one > run per migration (how often do packages migrate per day?). I think it's twice a day currently. > PS: Please give honest feedback. I’d rather been told that this is > rubbish and not spend time on it than spending time on it and afterwards > noticing that it will not be used. Remember my feedback is one of an (interested) outsider, I have almost no experience feeding hints to britney. Cheers, -- Raphaël Hertzog ◈ Debian Developer Follow my Debian News ▶ http://RaphaelHertzog.com (English) ▶ http://RaphaelHertzog.fr (Français) -- To UNSUBSCRIBE, email to debian-release-requ...@lists.debian.org with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org Archive: http://lists.debian.org/20110513201757.ga24...@rivendell.home.ouaza.com
SAT based britney
Hi Release-Team, in this mail, I’d like to introduce an idea for an improved testing migration system that has been floating in my mind since over a year. I originally wanted to propose and implement it for last DebConf, but did not get sponsored. This year I will attend DebCamp and -Conf, so it is about time for a RFC. It is going to be a slighter long mail and if you are not interested in the testing migration software aka britney, you might want to skip the mail. If you do read it, though, I am especially interested in the answers to the following questions: * Do you get it? (i.e., did I explain it well enough or are there questions left) * Do you think it can handle all current and anticipatable requirements? (I’m not involved in release team work and might missed some conditions.) * What do you think of the advantages and improvements that I point out? * What other benefits that I do not see yet would come from this system? * Is it worth the hassle? And before going into the details, here the high-level goals that I want to achieve with SAT-britney (stupid working title): * Delegate as much logic to a general purpose tool (a MAX-SAT-solver), for less specific code to maintain and to benefit from improvements in that area. * Completeness: If some set of packages have to migrate together, SAT-britney will migrate that together without any manual hinting required. * Correctness: Packages in testing are always installable. (See below for a caveat with regard to Conflicts) * Intelligible output explaining why a package has not migrated. I want something that is much better than update_output.txt * Easy, comprehensive and powerful way of removing or adding constraints, e.g. force_hints, package removals. The system is three-layered. On the lowest level is a general purpose MAX-SAT-solver. It takes as input a propositional formula in conjunctive normal form with some variables, and a list of “desired” variables. It will then output * a satisfying assignment which is maximal, i.e. there is no satisfying assignment where a strict superset of desired variables is true. * for each desired variable which is not true, a minimal set of clauses which prevent this variable from being set to true. A MAX-SAT-solver can be implemented using a SAT-solver such as picosat and some iterations, but there might be room for improvements. For our purpose, this is a black box. The next layer adds some bells and whistles around the SAT-solver, but is still not Debian-specific. Now the variables are arbitrary strings without whitespace, with will, in our case, have suggestive names such as ghc_7.0.3-1_src/testing or ghc_7.0.3-1+b1_amd64/testing. The clauses can be a bit more expressive than CNF and, very important for some of the goals, can carry a comments (here separated by "by" or "because"). The layer would process a list of clauses such as: ghc_7.0.3-1+b1_amd64/testing implies ghc_7.0.3-1_src/testing by policy. ghc_7.0.3-1+b1_amd64/testing implies any of gcc_4:4.5.2-5_amd64/testing gcc_4:4.4.5-2_amd64/testing by ghc's dependency "gcc (>= 4:4.2)" ghc_7.0.3-1+b1_amd64/testing implies not ghc6_6.12.1-1_amd64/testing by ghc's conflict "ghc6 (<< 7)" ghc_7.0.3-1_src/testing implies all of ghc_7.0.3-1+b1_amd64/testing ghc_7.0.3-1+b1_i386/testing ghc_7.0.3-1_powerpc/testing by release architecture sync at most one of gcc_4:4.4.5-2_amd64/testing gcc_4:4.5.2-5_amd64/testing because one version per package per distribution not gcc_4:4.5.2-5_src/testing because gcc is too young (4 out of 10 days) not gcc_4:4.5.2-5_src/testing because gcc is RC-buggy: #12345, #67789 And a list of variables that we would like to be true (i.e. packages that should eventually migrate to testing): ghc_7.0.3-1_amd64/testing ghc_7.0.3-1_powerpc/testing gcc_4:4.5.2-5_amd64/testing The exact syntax is of course not fixed yet, but I think you can see at what I am aiming at: A language that is somewhat easy to understand and that allows us to express all the conditions we impose upon packages in testing : * binary package always with the source package * all release architectures are in sync * all dependencies fulfilled * all build-dependencies fulfilled (currently not enforced, but easily enabled in this system if desired) * if packages exists in testing in old version and is not removed in unstable, then it should not be removed from testing * no too young packages should migrate * no RC-buggy packages should migrate * what else? The syntax is also human-writable, for the override lists. Besides the input file containing the full set of clauses representing the usual migration policies, there would be one manually edited file of clauses to be ignored and to be added. It could look like this: ignore not gcc_4:4.5.2-5_src/testing because we like rc bugs in gcc in testing add udev_167-2_src/testing implies lvm2_2.02.84-2_src/testing because other combinations break, but the packages do not actually imply that The third la