Re: A new metric for source package importance in ports

2013-11-28 Thread Johannes Schauer
Hi,

Quoting Dmitrijs Ledkovs (2013-11-28 01:15:06)
> On 28 November 2013 00:04, Steven Chamberlain  wrote:
> > I also find it interesting to see openjdk-7 listed but not gcj;  or even
> > gcc-4.8.  Was this computed for jessie or sid?
> 
> I guess implicit relationships are not considered: build-essential
> build-dependencies, and essential dependencies. I would expect
> packages in those two sets to have the highest rank, since,
> hypothetically, all packages in debian build-depend & depend on those.

Steven was looking at the second graph which (in contrast to the first graph)
makes the assumption that essential:yes and build-essential are already
available somehow (for example by having cross compiled them) and thus do not
need to be recompiled to bootstrap the port.

gcj and gcc-4.8 are among the packages which are drawn in by creating a
co-installation set of essential:yes and build-essential packages. Therefore
they do not appear in the second graph.

Since this co-installation set is an input to the algorithm that creates the
second graph, its packages implicitly receive the highest rank. For the same
reason you will also see them assigned the highest rank in the first graph,
which does not assume that essential:yes and build-essential are already
available.

Both algorithms take implicit dependency relationships into account when
calculating the strong dependencies and the dependency closure of source and
binary packages. My code uses dose3 to do the required calculations.

cheers, josch


--
To UNSUBSCRIBE, email to debian-amd64-requ...@lists.debian.org
with a subject of "unsubscribe". Trouble? Contact listmas...@lists.debian.org
Archive: http://lists.debian.org/20131128080700.2752.98455@hoothoot



Re: A new metric for source package importance in ports

2013-11-28 Thread Johannes Schauer
Hi,

Quoting Steven Chamberlain (2013-11-28 01:04:56)
> On 27/11/13 17:58, Johannes Schauer wrote:
> > http://mister-muffin.de/p/Gid8.txt
> > 
> > One can see that now the amount of source packages which is needed to build
> > the rest of the archive is only 383.
> 
> So, there are 383 packages that share the same, maximum value (in this
> case 11657) in the second column?

Correct.

$ curl -s http://mister-muffin.de/p/Gid8.txt | awk '{ if ($2=="11657") print $0 }' | wc -l
383

In this particular graph the maximum value of the second column (11657) is less
than the total number of source packages (in contrast to the first graph)
because this second graph assumes that arch:all, essential:yes and
build-essential packages do not have to be rebuilt. Therefore, many source
packages do not have to be compiled at all.
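
The awk call above hard-codes the maximum 11657. As a rough sketch, the same
count can be obtained without knowing the maximum in advance. This assumes the
file is whitespace-separated with the value of interest in the second column
(which is all the awk call implies); the function name and the sample rows are
invented for illustration and are not taken from Gid8.txt.

```python
from collections import Counter


def count_rows_at_max(lines):
    """Return (max_value, number_of_rows_with_that_value) for column 2."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or too-short lines
        try:
            counts[int(fields[1])] += 1
        except ValueError:
            continue  # skip lines whose second column is not an integer
    if not counts:
        raise ValueError("no numeric values found in column 2")
    top = max(counts)
    return top, counts[top]


# Invented sample rows (assumption: name, then numeric columns).
sample = [
    "src:a 5 7",
    "src:b 9 9",
    "src:c 9 12",
    "src:d 3 4",
]
print(count_rows_at_max(sample))
```

Fed with the lines of the real file, the second element of the result should
correspond to the 383 reported above, provided the column layout matches the
assumption.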

> > Does anybody see enough value in these numbers for source package
> > importance in the light of bootstrapping Debian (either for a new port or
> > for rebuilding the archive from scratch)?
> 
> I find the list of 383 packages interesting, at least.  I think this
> closure is what I had in mind[0] for regular testing of ports' toolchains and
> reproducibility of builds.

In that email you wrote:

> Some people have been trying to identify small sets of essential packages
> already, in the context of bootstrapping an architecture[1].  I wonder if
> that's likely to overlap with this?  It encompasses toolchain and essential
> arch-specific packages.
> 
> I imagine a healthy port should be able to bootstrap itself with only current
> package versions.  If this was being tested regularly it could let porters
> know if circular dependencies are introduced

Yes, if you omit the necessity to rebuild arch:all packages, then these 383
source packages are about what you were talking about: the set of source
packages which makes a port able to "bootstrap" itself. Note, though, that this
number (383) is only a lower bound because it was derived using strong
dependencies only. You can see the upper bound in the column that was
calculated using the closure graph, which gives 457 source packages.
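
To illustrate why strong dependencies give a lower bound and the closure an
upper bound, here is a small brute-force sketch on a made-up package universe.
It is not how dose3 works (dose3 uses a solver and handles versions and
conflicts); the package names and the DEPENDS table are hypothetical, and
conflicts are ignored entirely.

```python
from itertools import combinations

# Hypothetical toy universe: package -> list of clauses; each clause is a
# list of alternatives, any one of which satisfies it (like "a | b").
DEPENDS = {
    "app": [["libfoo"], ["mta-a", "mta-b"]],
    "libfoo": [["libc"]],
    "mta-a": [["libc"]],
    "mta-b": [["libc"], ["libbar"]],
    "libbar": [],
    "libc": [],
}


def closure(pkg):
    """Everything reachable when every alternative is followed."""
    seen, todo = set(), [pkg]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        for clause in DEPENDS[current]:
            todo.extend(clause)
    return seen - {pkg}


def is_installation_set(candidate):
    """True if every clause of every member is satisfied inside the set."""
    return all(
        any(alt in candidate for alt in clause)
        for member in candidate
        for clause in DEPENDS[member]
    )


def strong_dependencies(pkg):
    """Packages contained in every installation set that contains pkg."""
    others = [p for p in DEPENDS if p != pkg]
    common = None
    for size in range(len(others) + 1):
        for extra in combinations(others, size):
            candidate = {pkg, *extra}
            if is_installation_set(candidate):
                common = candidate if common is None else common & candidate
    return (common or set()) - {pkg}


print(sorted(strong_dependencies("app")))
print(sorted(closure("app")))
```

In this toy universe "app" strongly depends only on libfoo and libc, because
either mta alternative can satisfy the second clause, while the closure also
pulls in mta-a, mta-b and libbar. The real numbers (383 versus 457) differ for
the same kind of reason.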

If you also want to rebuild arch:all packages, then you have to look at the
first graph and then the number quickly climbs to 1194 source packages minimum
and 1424 source packages maximum.

> Does the list vary by architecture?  I see many odd things in here such as
> 'systemd' and 'redhat-cluster' which would be unavailable if trying to
> bootstrap a non-Linux port, for example.

Yes, it does vary by architecture because dependencies can have architecture
qualifiers. Here, I used amd64 as an example.
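
As a rough illustration of how such qualifiers change the result per
architecture, the sketch below decides whether a dependency carrying a
restriction list like `[linux-any]` or `[!hurd-any]` applies to a given host
architecture. The function names are invented, and the architecture matching
is deliberately simplified (a name without a dash, such as amd64, is assumed
to mean Linux); dpkg's real architecture tables cover more cases.

```python
def split_arch(arch):
    """Split a Debian architecture name into (os, cpu).

    Simplification: a name without '-' (such as 'amd64') is taken to be
    a Linux architecture.
    """
    if "-" in arch:
        os_name, cpu = arch.split("-", 1)
        return os_name, cpu
    return "linux", arch


def matches(term, host):
    """True if one restriction term (e.g. 'linux-any') matches host."""
    if term == "any":
        return True
    term_os, term_cpu = split_arch(term)
    host_os, host_cpu = split_arch(host)
    return term_os in ("any", host_os) and term_cpu in ("any", host_cpu)


def applies(restriction, host):
    """Decide whether a dependency with a '[...]' restriction applies.

    restriction is the text between the brackets, e.g. '!hurd-any' or
    'amd64 i386'. Terms are expected to be all negated or all positive.
    """
    terms = restriction.split()
    if not terms:
        return True  # no qualifier: applies everywhere
    if terms[0].startswith("!"):
        return not any(matches(t.lstrip("!"), host) for t in terms)
    return any(matches(t, host) for t in terms)


# A hypothetical build dependency "foo [linux-any]":
print(applies("linux-any", "amd64"))
print(applies("linux-any", "kfreebsd-amd64"))
```

A dependency restricted to linux-any is part of the graph for amd64 but drops
out for kfreebsd-amd64, which is why the resulting package list would look
different for a non-Linux port.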

> I also find it interesting to see openjdk-7 listed but not gcj;  or even
> gcc-4.8.  Was this computed for jessie or sid?

Using Debian Sid as of yesterday.

cheers, josch


--
Archive: http://lists.debian.org/20131128080012.2752.32993@hoothoot