Re: [gentoo-dev] Re: New category proposal

Brian Harring Wed, 11 May 2005 01:42:04 -0700

> On Wed, May 11, 2005 at 09:46:03AM +0200, Kevin F. Quinn wrote:
> Here's my suggestion, for what it's worth :)
> 
> The layout on disk and the semantics of categories do not need to be related. 
Yes and no.  You're assuming that people don't use the layout on disk for
digging around without calling portage.  Personally, I do.


> I like the idea of using the first character of a package name as the
> sub-directory name.  This can be extended more deeply as and when necessary
> to avoid over-large directories which cause problems on some filesystems.
> e.g. for sudo you get "s/sudo" and vim-sudo "v/vim-sudo".  This is
> architecture-neutral, rsyncable, scalable, and not too difficult for users to
> parse manually (see later for searching through categories).  If the
> algorithm portage would use to locate a package is such that it doesn't
> mandate the depth (i.e. tries "package", "p/package" if "p/" exists,
> "p/a/package" if "p/a/" exists) then overlays can have a different depth to
> the rsync tree; if you only have a few packages in overlay then they need not
> be in subdirectories at all.
Re-asserting that the fs layout *does* matter, how is that more intuitive when
trying to track down the ebuild for dev-util/diffball?  How many directories
deep would I have to go before I reached the ebuild?
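
To make the depth question concrete, here is a minimal sketch of the
variable-depth lookup described in the quoted proposal.  It is not portage
code: locate_package and max_depth are hypothetical names, and it assumes no
package shares its name with a single-letter shard directory.

```python
import os

def locate_package(root, name, max_depth=3):
    """Try root/name, then root/n/name, then root/n/a/name, and so on.

    Returns (path, depth), where depth is the number of shard directories
    that had to be descended, or (None, None) if the package is not found.
    Assumption: a shard directory never has the same name as a package.
    """
    prefix = root
    for depth in range(max_depth + 1):
        candidate = os.path.join(prefix, name)
        if os.path.isdir(candidate):
            return candidate, depth
        if depth >= len(name):
            break  # ran out of characters to shard on
        # Descend one more level only if that shard directory exists.
        prefix = os.path.join(prefix, name[depth])
        if not os.path.isdir(prefix):
            break
    return None, None
```

With a tree sharded two levels deep, diffball would resolve to
"d/i/diffball" at depth 2, while an unsharded overlay would answer at depth 0.
Someone browsing by hand has to repeat the same probing.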

The changes I posit aren't any more friendly to devs doing ebuild work, and
require a flat namespace: no conflicts, meaning that we have to choose
alternate names for conflicts (or the category data winds up in the name).
Like I said, I really dislike Debian's flat namespace, even if we had a
category component to it.

> The key here is to separate the category (metadata) and filesystem layout
> (implementation detail) from the concept of package name.  This opens up all
> sorts of possibilities, for example different layouts in CVS, on mirrors and
> on clients (some kind of custom rsync would be necessary) - but that's going
> perhaps too far...
This also locks out several possibilities, like relying on dir structure to
limit the searches.
If you force category classification to be metadata, you need an additional db
for searching and for basic atom lookup.  That's 19000+ keys in a db.  With no
db, you force a tree-wide search, which _will_ be as fast as emerge -S is.
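
A rough sketch of the difference, under stated assumptions: the first function
relies on the category being a directory, the second on a hypothetical
per-package "categories" file standing in for whatever metadata format would
be chosen.  Neither is portage's actual implementation.

```python
import os

def packages_in_category_by_layout(tree, category):
    """Category is a directory: one directory listing answers the query."""
    cat_dir = os.path.join(tree, category)
    return sorted(os.listdir(cat_dir)) if os.path.isdir(cat_dir) else []

def packages_in_category_by_metadata(tree, category, filename="categories"):
    """Category is metadata in a flat tree: every package must be visited.

    Returns (matches, visited) so the cost of the scan is visible.
    The "categories" file name is an assumption made for this sketch.
    """
    matches, visited = [], 0
    for pkg in sorted(os.listdir(tree)):
        visited += 1
        meta = os.path.join(tree, pkg, filename)
        if not os.path.isfile(meta):
            continue
        with open(meta) as fh:
            if category in fh.read().split():
                matches.append(pkg)
    return matches, visited
```

The second form touches every package in the tree unless a separate index is
kept, which is the additional db mentioned above.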

> Categories become metadata, formally (this is the root of the problem -
> including the category in the package name is a pollution of the package
> name).  Once they become properly understood and implemented as metadata, a
> package being a member of more than one category is a natural consequence.
Currently, the only conflicts that can occur in searches are package specific.
Atoms, the basis of our depends system, require categories; as such, conflicts
*cannot* occur.
Multiple categories per package allows for conflicts to occur in our deps.
This is nasty, and again, requires pretty much a walk of the whole tree to
verify no conflicts (mr_bones_, aka Michael Sterrett, would probably quietly
curl up and die when his repoman runs, which are now under an hour, clear
2 hours again) :)
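
The kind of check this would force can be sketched as follows.  The function
name, the input shape and the package names in the usage note are all
hypothetical; the point is only that every package in the tree has to be
examined before any atom can be declared unambiguous.

```python
from collections import defaultdict

def find_atom_conflicts(packages):
    """Report category/name atoms claimed by more than one package.

    packages: iterable of (package_id, name, categories) tuples, where
    categories lists every category the package claims membership of.
    """
    claimed = defaultdict(list)
    for pkg_id, name, categories in packages:
        for cat in categories:
            claimed[f"{cat}/{name}"].append(pkg_id)
    # An atom is ambiguous as soon as two distinct packages answer to it.
    return {atom: ids for atom, ids in claimed.items() if len(ids) > 1}
```

For example, if two distinct packages are both named "foo", one in app-a and
one in app-b, and the first is also tagged app-b, then "app-b/foo" matches
both.  With exactly one category per package that cannot happen.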

> Portage would essentially ignore categories.  Some support would be necessary
> to allow the user to query categories (since 'ls /usr/portage/<category>'
> would no longer work) - but searching for packages is already a function and
> would just need to be adapted (and perhaps optimised ;) ).  Indeed just
> listing out portage directories at the moment is often insufficient to find a
> suitable package, since package names are often amusing but uninformative
> acronyms.
Portage can't ignore categories, see the bit above about cat/pkg-ver (cpv from
this point on) conflicts.  cpvs can't conflict, pure and simple, under the
current layout, which is enforced by the single category/fs layout.

What are we gaining?  Ability to find a package under two categories?


> The benefits include
> 1) no more "moving packages around the tree"
cpv conflict.  You aren't moving the fs position of it, but it still requires
walking the tree and updating all atoms that reference the old position.

> 2) categories can be added to a package in the most natural way
Elaborate.

> 3) overlays can be tidier
Eh?

> well, it's a big downside...
E'yep. :)

> Having said that, some things could be done now.  If a flat package namespace 
> is desirable, the existing package name clashes could be resolved by renaming 
> the few packages that clash.
Roughly 74 packages out of roughly 9429.
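
A figure of this kind can be derived from a list of category/package names.
The sketch below is one way to do it, not necessarily how the numbers above
were obtained; name_clashes is a hypothetical helper.

```python
from collections import defaultdict

def name_clashes(cat_pkgs):
    """Map each package name found in more than one category to those categories.

    cat_pkgs: iterable of "category/package" strings.
    """
    by_name = defaultdict(set)
    for cp in cat_pkgs:
        category, name = cp.split("/", 1)
        by_name[name].add(category)
    return {name: sorted(cats) for name, cats in by_name.items() if len(cats) > 1}
```

Every key in the result is a name that would need renaming before the tree
could be flattened.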

> Category could be added as a field in metadata.xml, so that a package could
> "belong" to multiple categories.  The query/search tools could be enhanced to
> scan this metadata (perhaps including the current category directory as an
> implied entry in the metadata.xml).
If that's the goal of the "belong to N categories" thread, strictly searching,
sure, although I don't like it.  It can't become an atom for *DEPEND due to the
cpv nonconflicting bit.
~brian
-- 
gentoo-dev@gentoo.org mailing list
