[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future
Alec Warner [EMAIL PROTECTED] writes:

> - Space savings. Certainly your scheme may be smaller, but the XML tag overhead may eat into the savings. You should do some estimates to show the community how much smaller the tree will be from this proposal.

Sorry, but you lost me on whatever point you were making, because from here on I feel like you were trying to put words in my mouth.

Besides, if you really want to go down that road, you should also count this: apart from ReiserFS with tail packing, I don't remember any other Linux filesystem that has blocks smaller than 512 bytes, which means that each file in the metadata cache takes up much more space than just its size in characters. All your math is thus wrong.

--
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/
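The block-size argument above can be made concrete with a little arithmetic. The following is only a sketch: the 4096-byte default block size and the three entry sizes are assumptions chosen for illustration, and it ignores inode overhead as well as tail packing or inline data.

```python
def on_disk_size(content_bytes: int, block_size: int = 4096) -> int:
    """Space a file's data occupies when allocation is rounded up to whole blocks.

    block_size is an assumed value; pass the real one for your filesystem.
    """
    # Integer ceiling division: -(-a // b) rounds up without using floats.
    blocks = -(-content_bytes // block_size)
    return blocks * block_size


# Hypothetical sizes (in bytes) of three small metadata cache entries.
entry_sizes = [310, 450, 1200]

nominal = sum(entry_sizes)
allocated = sum(on_disk_size(size) for size in entry_sizes)

print(f"sum of file sizes: {nominal} bytes")
print(f"space allocated:   {allocated} bytes")
```

Under these assumptions, shaving a few dozen characters off each file changes the first number but not the second, which is the point being made about counting characters.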
[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future
Jan Kundrát [EMAIL PROTECTED] writes:

> But also the need to replicate http://www.kde.org/ to metadata.xml of all KDE split ebuilds -- right now, this is set by an eclass.

The usefulness of this is IMHO debatable; why not write it in just one package (say kde-base/kde or kde-meta) and only there? Having each mini-package declare that as its homepage is not very useful to me, but I guess it's debatable.

>> - allows proper handling of packages lacking a HOMEPAGE;
>
> Could you elaborate a bit about how different is handling of an empty/uninitialized shell variable from an empty XML element?

The difference is that you can provide _other_ links besides a homepage, like unmaintained, gentoo:userguide and so on, so that users don't just get no homepage at all, and are not misdirected by a homepage set to http://www.gentoo.org/ or something similar.

>> - users can check the metadata much more easily by just opening the xml file or interfacing to that rather than having to skim through the ebuild, the xml files are probably more user readable then ebuilds using multiple eclasses;
>
> Haven't we already agreed that accessing ebuilds/... directly is broken by design?

For software, sure, but as a user I tend to just look at the files when I want the homepage of a package I know, and if I see a metadata.xml file I'm more likely to look at that than at the metadata cache in /var/db/... . And an XML file is certainly more user-readable than HOMEPAGE with a depend-like syntax for labels, conditionals and whatever else Alec seems to be proposing for EAPI=3.

>> - webapps like packages.gentoo.org would be able to display basic information without having to parse the ebuilds or the metadata cache.
>
> Except for the ebuilds which still use the old format (that is 100% of the tree right now)

This is of course meant to apply once the change is fully implemented.

--
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/
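To illustrate the point about tools reading links straight from metadata.xml, here is a minimal sketch. The `<homepage>` element and its `type` attribute are hypothetical: the thread does not define the actual markup, and the example.org URL is a placeholder.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: <homepage> and its "type" attribute are assumptions
# made for this sketch, not an existing metadata.xml schema.
SAMPLE = """<pkgmetadata>
  <homepage>http://www.kde.org/</homepage>
  <homepage type="gentoo:userguide">http://www.example.org/guide</homepage>
</pkgmetadata>"""


def homepage_links(xml_text):
    """Return (link type, URL) pairs; untyped entries count as 'homepage'."""
    root = ET.fromstring(xml_text)
    return [
        (element.get("type", "homepage"), (element.text or "").strip())
        for element in root.iter("homepage")
    ]


for link_type, url in homepage_links(SAMPLE):
    print(f"{link_type}: {url}")
```

A package with no homepage at all would simply yield an empty list, or only typed links such as a user guide, rather than a misleading placeholder URL.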
[gentoo-dev] Re: debug/release builds extensions/clarification proposal
Maciej Mrozowski [EMAIL PROTECTED] writes:

> - USE=debug is useless when CFLAGS/LDFLAGS or FEATURES are not appropriate

What are you saying here? I'm afraid you're mistaken. For the most part, USE=debug means "enable debug code paths", which for lots of projects simply means "enable assertions"; there are packages that take it to mean "enable debug symbols" too, but I don't think that's very valid, since users might want debug code paths but not symbols and vice versa (I do indeed have debug symbols but no debug code paths enabled).

Now, just to make sure the common misconceptions don't hit again:

- -ggdb *does not have any runtime performance hit*, neither in execution time nor in memory usage; the debug sections are not mapped into memory at all; this is true for both non-stripped and split executables;

- -O0 is not always a good idea; besides bugs in packages concealed by -O1+ [1], there are some further points: the shortage of registers on x86 causes build failures, and if (0) branches are not optimised away, so packages like FFmpeg fail to link because undefined references are not pruned; this means that using -O0 unconditionally for every package for debugging is not really an option.

[1] http://blog.flameeyes.eu/2008/09/02/testing-the-corner-cases

--
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/
Re: [gentoo-dev] debug/release builds extensions/clarification proposal
On Mon, 01/12/2008 at 06:16 +0100, Maciej Mrozowski wrote:

> Currently handling debug/release builds is incoherent and misleading to say the least. We have got in Gentoo:

All those parts do separate and quite different jobs, so I can't say that it's incoherent (in concept, at least).

> The drawbacks are as follows:
> - USE=debug is useless when CFLAGS/LDFLAGS or FEATURES are not appropriate

USE=debug enables additional debug output or more assertions in the code. It's hard to say in advance exactly what USE=debug does, since different packages enable different things. But generally it adds extra code via -DDEBUG, and this is independent of CFLAGS/LDFLAGS. If you know of packages where this is not true, file bugs against them.

> - CFLAGS/LDFLAGS must be set globally when they are about to be supported - those who don't want to set them globally, they are forced to use (very flexible and great indeed) /etc/portage/env hack - which is undocumented and unsupported, because everything user set there, is not shown by emerge --info, thus bug reports from such machines are not taken into consideration, as virtually everything that breaks can be there

This leads me to a different conclusion. I was thinking about a new Portage feature: emerge --info pkg. That is, make Portage show not only global information but per-package information as well. In many cases this would simplify analysing the problem.

> - too much choice leads to confusion

That's always true. But we use Gentoo because we enjoy our freedom to choose... Right? :)

> Implementation is trivial - eclass would be responsible for handling USE=debug flag, when debug is set:
> - replace CFLAGS with CFLAGS_DEBUG, LDFLAGS with LDFLAGS_DEBUG and possibly others
> - replace FEATURES with FEATURES_DEBUG

USE flags should never change {C,LD}FLAGS or FEATURES, as they are different things, and such a relation between USE flags, {C,LD}FLAGS and/or FEATURES will lead to even more confusion. (There is also the complexity Duncan told you about...)
Personally, to get a build with symbols I use a trivial wrapper around emerge:

    demerge() {
        # Multi-word values must be quoted so that env receives them as
        # single VAR=value arguments.
        env USE="debug" CFLAGS="-O2 -pipe -g -ggdb" PKGDIR="/vt/binpkg-debug" \
            FEATURES="buildpkg splitdebug collision-protect ccache noclean installsources" \
            emerge "$@"
    }

and I use demerge whenever I need to debug a package. I'm sure this is just a quick hack which could be greatly improved to track which packages are installed with or without symbols. But you get the idea: such things are better done with separate, tiny and simple wrappers.

P.S. I remember most of this has already been discussed on this mailing list. Try searching it and you'll find many more ideas and motivations.

--
Peter.
Re: [gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future
On Mon, Dec 1, 2008 at 12:24 AM, Diego 'Flameeyes' Pettenò [EMAIL PROTECTED] wrote:

> Alec Warner [EMAIL PROTECTED] writes:
>
>> - Space savings. Certainly your scheme may be smaller, but the XML tag overhead may eat into the savings. You should do some estimates to show the community how much smaller the tree will be from this proposal.
>
> Sorry but you lost me on any point you might have brought across since after this I feel like you were trying to put words in my mouth.

Sorry for that, I never meant to imply that you said space savings. That being said, I still don't see the usefulness here. You seem to think that using the existing APIs for this data is wrong, and I think the opposite, so I guess we will agree to disagree on this matter.

> Beside, if you really want to go down that road you should be counting that beside ReiserFS with tail, I don't remember any other Linux FS that has block smaller than 512bytes, which means that each file in metadata cache is taking up much more than just its size in characters. All your math is thus wrong.

As was pointed out on IRC, UTF-8 characters are not a fixed size, making my math even more wrong ;)

> --
> Diego Flameeyes Pettenò
> http://blog.flameeyes.eu/
[gentoo-dev] Re: [RFC] Moving HOMEPAGE out of ebuilds for the future
Alec Warner [EMAIL PROTECTED] writes:

> That being said I still don't see the usefulness here. You seem to think that using the existing APIs for this data is wrong, and I think the opposite, so I guess we will agree to disagree on this matter.

Yeah, I still think that there is no point in requiring the use of a specific API when the same data can easily be made available in a format that is easy to parse in pretty much any programming language, modern or not. Besides, I find expanding the HOMEPAGE syntax to allow more than one link a bit ... overkill, if the same thing can be achieved in metadata.xml...

>> Beside, if you really want to go down that road you should be counting that beside ReiserFS with tail, I don't remember any other Linux FS that has block smaller than 512bytes, which means that each file in metadata cache is taking up much more than just its size in characters. All your math is thus wrong.
>
> As was pointed out on IRC, UTF8 characters are not a fixed size, making my math even more wrong ;)

If we consider HOMEPAGE, the assumption that characters have a fixed size of 1 byte is good enough; URLs are usually written in pure ASCII for compatibility, and while IDN would break that assumption, we can't even assume that IDN is always available. For the description it may be different, because there is room there for non-ASCII UTF-8 characters, but that would take us even further from the point.

--
Diego Flameeyes Pettenò
http://blog.flameeyes.eu/
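The character-versus-byte distinction is easy to check. In this small sketch the URL comes from the thread, while the description string is a made-up example containing one non-ASCII character.

```python
def utf8_size(text: str) -> int:
    """Number of bytes the text occupies when encoded as UTF-8."""
    return len(text.encode("utf-8"))


url = "http://www.kde.org/"          # pure ASCII, as most URLs are
description = "Pettenò's blog"       # hypothetical text with one non-ASCII letter

for label, text in (("url", url), ("description", description)):
    print(f"{label}: {len(text)} characters, {utf8_size(text)} bytes")
```

For the ASCII URL the two counts agree, so treating one character as one byte is safe there; the description needs one extra byte for the accented letter.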
Re: [gentoo-dev] Re: debug/release builds extensions/clarification proposal
On Monday 01 of December 2008 08:04:04 Duncan wrote:

Well, so far it's not a GLEP, just an idea thrown out for brainstorming.

> As such, neither /etc/portage/env nor eclasses can effectively deal with FEATURES in general, tho there are a few specific exceptions that do happen to be implemented at the bash level.

Those exceptions include at least nostrip and splitdebug; besides, I intend to keep this at the bash (or ebuild) level only, to preserve simplicity while keeping it functional. FEATURES_DEBUG looked like a clean and convenient approach only because I was unaware of the FEATURES internals; thanks for the clarification. The small inconsistency in FEATURES still needs to be addressed. The goal is to have one single, well-defined and always working way of not stripping symbols. Of course, it can easily be handled in an eclass by something like this:

    if use debug; then
        # Drop both strip-related features, then add back the preferred one.
        FEATURES="${FEATURES//splitdebug/}"
        FEATURES="${FEATURES//nostrip/}"
        FEATURES="${FEATURES} ${PREFERRED_NOSTRIP_METHOD}"
    fi
[gentoo-dev] Re: [RFC] Saving package emerge output (einfo, elog, ewarn, etc.) somewhere official
Hi,

Dale [EMAIL PROTECTED]:

> If you have a GUI on your system, give this a look: app-portage/elogviewer That should help you a lot. I been using it for a good while and it works pretty well. I do wish it had little flags in the list of packages that have been installed. Sort of a short and sweet notice there is something there without actually have to look. Maybe a red flag when there is something really serious to know and other colors for other things.

app-portage/elogv (ncurses) and app-portage/kelogviewer (Qt based) are really nice, too. Unfortunately the two GUI variants are homeless, so improvements won't happen from the original upstream.

V-Li

--
Christian Faulhammer, Gentoo Lisp project
URL:http://www.gentoo.org/proj/en/lisp/, #gentoo-lisp on FreeNode
URL:http://www.faulhammer.org/
Re: [gentoo-dev] Re: debug/release builds extensions/clarification proposal
On Monday 01 of December 2008 09:36:12 Diego 'Flameeyes' Pettenò wrote:

>> - USE=debug is useless when CFLAGS/LDFLAGS or FEATURES are not appropriate
>
> What are you saying here? I'm afraid you're mistaken here.

The point is to look at this (at least a bit) from the users' point of view: the USE=debug flag is ambiguous in its meaning. While it enables only code paths (asserts, #ifdefs and similar), its name suggests that it also enables debug symbols, and for some packages it does more than suggest. And the policy, as far as I know, is to enforce CFLAGS from make.conf and wipe out all package-defined flags.

> For the most part, USE=debug means enable debug code paths, which for lots of projects simply means enable assertions; there are packages that take this as enable debug symbols too but I don't think that's very valid since users might want debug code paths but not symbols and vice-versa (I indeed have debug symbols but no debug codepaths enabled).

That's correct. The problem is that Gentoo does not provide an officially supported mechanism for enabling both, or just debug symbols, on a per-package basis; it doesn't even provide any supported or documented mechanism for per-package CFLAGS, FEATURES and similar. If the /etc/portage/env hack/feature could be made official (for CFLAGS, LDFLAGS and bash-domain FEATURES), it could address this issue well enough, because with a smart combination of symlinks and files it would deliver the ultimate configuration power, not just the cleanup/workaround I am actually proposing. Per-package debug/release/profile/or_any_other configuration is what I would pursue, and in my proposal I used USE=debug as an existing and supported way of achieving this. While I don't like the hack @pve uses (I prefer portage/env as the more convenient way), his idea about emerge --info pkg seems interesting.

> - -ggdb *does not have any runtime performance hit*; neither in

Yes, I'm well aware of that, though it increases disk space requirements a bit, as it's applied to all libs/bins.
> - -O0 is not always a good idea; beside bugs in packages concealed by -O1+ [1]

[1] is a pathology and should be fought against. -O1+ may leave the stack frames useless for debugging in some places because of inlining (debugging inline class implementations in particular is limited, which affects Qt/KDE). Besides, I may not have stated it clearly: those default values would be defined in the very same make.conf, so it could be:

    CHOST="x86_64-pc-linux-gnu"
    CFLAGS="-march=nocona -O2 -pipe -msse3 -ftree-vectorize"
    CXXFLAGS="${CFLAGS}"
    CFLAGS_DEBUG="-O2 -ggdb"

Yet I still cannot think of this proposal as anything other than a dirty workaround for a problem that doesn't really exist (well, at least for developers, who have a meta-distribution and ultimate freedom for the user in mind). For users the problem is real; of course, it's usually a consequence of either not being aware of those mechanisms or of the ambiguous semantics of USE=debug.

And what about pushing some bash-domain FEATURES, like nostrip and splitdebug, to USE flags? I guess being able to set them per package is important.

--
regards
MM
[gentoo-dev] Monthly Gentoo Council Reminder for December
This is your monthly friendly reminder ! Same bat time (typically the 2nd Thursday at 2000 UTC / 1600 EST), same bat channel (#gentoo-council @ irc.freenode.net) ! If you have something you'd wish for us to chat about, maybe even vote on, let us know ! Simply reply to this e-mail for the whole Gentoo dev list to see. Keep in mind that every GLEP *re*submission to the council for review must first be sent to the gentoo-dev mailing list 7 days (minimum) before being submitted as an agenda item which itself occurs 7 days before the meeting. Simply put, the gentoo-dev mailing list must be notified at least 14 days before the meeting itself. For more info on the Gentoo Council, feel free to browse our homepage: http://www.gentoo.org/proj/en/council/
Re: [gentoo-dev] Monthly Gentoo Council Reminder for December
On 01 Dec 2008 05:30:01 Mike Frysinger [EMAIL PROTECTED] wrote:

> If you have something you'd wish for us to chat about, maybe even vote on, let us know ! Simply reply to this e-mail for the whole Gentoo dev list to see.

Please give the OK on the following, assuming no objections crop up before then:

* [RFC] Label profiles with EAPI for compatibility checks (revised)
  http://archives.gentoo.org/gentoo-dev/msg_930f58fcebcbbcbe523c001f2c825179.xml

* EAPI change: Call ebuild functions from trusted working directory
  http://archives.gentoo.org/gentoo-dev/msg_5ba467bbd5a0820e040210683702a67f.xml

* RFC: DEFINED_PHASES magic metadata variable
  http://archives.gentoo.org/gentoo-dev/msg_8c34d8efbc0d31ab28c517403dc83f62.xml

--
Ciaran McCreesh
Re: [gentoo-dev] Jeeves IRC replacement now alive - Willikins
On Wed, Aug 6, 2008 at 3:18 PM, Robin H. Johnson [EMAIL PROTECTED] wrote:

> Getting the bot out there
>
> If you would like to have the new bot in your #gentoo-* channel, would each channel founder/leader please respond to this thread, stating the channel name, and that they are the contact for any problems/troubles.

Hi,

#gentoo-prefix please. I am the channel founder and am available on IRC for 'issues'.

Thanks,
Jeremy
Re: [gentoo-dev] debug/release builds extensions/clarification proposal
On Mon, 01 Dec 2008 11:39:35 +0300 Peter Volkov [EMAIL PROTECTED] wrote:

> This leads me to different conclusion. I was thinking about new portage feature: emerge --info pkg . So to make portage show not only global information but per-package either. In many cases this will simplify analyzing of the problem.

That feature already exists (for installed packages at least).

Marius
Re: [gentoo-dev] Re: debug/release builds extensions/clarification proposal
On Monday 01 of December 2008 08:04:04 Duncan wrote:

> (Of course, if it's the latter, it will need to be an official GLEP, and you'll have three separate package managers and their developers to push the proposal thru to at least to general agreement, or the council will almost certainly reject the GLEP, if it gets even that far.)

That I find interesting: what does any third-party package manager have to do with setting policies and enhancements for the official Gentoo package manager? Have you ever heard of the liberum veto? But that's off topic, of course.

--
regards
MM
Re: [gentoo-dev] [RFC] Saving package emerge output (einfo, elog, ewarn, etc.) somewhere official
To summarize what I've read in this thread, it seems you want to find a way to help users find information they aren't looking for. If users aren't curious about their system, they will surely have a hard time figuring out how to fix it when the need arises. PORTAGE_ELOG_* isn't really that hard to find in make.conf.example (even though its new location makes it a bit harder to find).

As others have said, there are already proper systems, documentation and links from other docs. Not finding this is what I'd call laziness or a lack of Google-fu. Don't misunderstand me, some things can slip under anyone's radar, and that's OK; real people are still here to point you in the right direction.

If you find a better way to convey this information to users, then please surprise me. For now I think we are in good shape.

--
Gilles Dartiguelongue [EMAIL PROTECTED]
Gentoo
Re: [gentoo-dev] [RFC] Saving package emerge output (einfo, elog, ewarn, etc.) somewhere official
Gilles Dartiguelongue wrote:

> As others have said, there are already proper systems, documentation and linking through other docs. Not finding this is what I'd call lazyness or lack of google foo. Don't misunderstand me, some stuff can get ouf of the radar of everyone, it's ok, real people are still here to point you in the right direction.

I think that I probably did not express my idea as well as I could have, since most of the responses I have gotten have echoed your thoughts that Gentoo does, indeed, have the facilities to achieve flexibility in logging, etc. I totally agree. Gentoo's capabilities, although not perfect, of course, are superlative and complement its superb online docs. I think that's a big reason why we're all here: we see this and appreciate it. In fact, even when I do not include the word "gentoo" in a Google search, I more often than not end up at a Gentoo doc page, which is impressive.

However, what I see as perhaps a missing piece is more conceptual: the important connection between the valuable info in the emerge logs (which are somewhat transient by default) and what a user looks for when he/she has a problem with a package. Yes, users will realize this as they use Gentoo (and will start paying more attention to logs as a result), so I don't think it's a huge problem, but what this particular user said to me made me think that there is, perhaps, an opportunity to improve the situation.

There is no Gentoo-specific "readme" facility, which could be the obvious and de facto place to go when trouble is had. I can imagine that a fairly simple and low-effort way of starting such a resource would be to simply echo the log output into a package-specific file in a known place (or put it in the portage db). The logging facilities allow similar things if configured to do so, but this is not on by default.
Once users know where to go to see the instructions or notes on getting a package up and running after installation, this would become a good place to keep such info or to expand on how the facility works. Starting with just the plain emerge log output would be an easy way to get some benefit if such a concept has merit.

And by no means would such a thing be an attempt to replace the excellent online docs or wiki, either; I see both as having unique strengths. For example, for detailed info on packages, the wiki/web material is the better resource. The answer to a quick check of whether a revdep-rebuild might have been necessary after installing a new package, on the other hand, would typically be in the log/notes. The notes also have the key advantage that they would *always* contain what the log output was, whereas whether a wiki or web page exists for a particular package depends on whether someone spent the time to author one.

My intention with the RFC was to see if the concept has any worth and to kick it around a bit. I do not really see this as a deficiency in Gentoo's technology (which I have a feeling is how many here have interpreted it), but simply something that, if done correctly, could be useful.

-Joe
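As a rough illustration of the "package-specific file in a known place" idea, the sketch below appends messages to one notes file per package. Everything here is hypothetical: the `save_notes` helper, the `.notes` layout, the package version and the message texts are invented for the example, and Portage's own PORTAGE_ELOG_* settings mentioned earlier in the thread remain the supported route.

```python
import tempfile
from pathlib import Path


def save_notes(notes_root, cpv, messages):
    """Append (level, text) messages to <notes_root>/<category>/<pkg-version>.notes.

    Hypothetical helper: neither the function nor the file layout exists in Portage.
    """
    category, _, package = cpv.partition("/")
    path = Path(notes_root) / category / f"{package}.notes"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as handle:
        for level, text in messages:
            handle.write(f"[{level}] {text}\n")
    return path


# Demo in a throwaway directory; the package version and messages are made up.
with tempfile.TemporaryDirectory() as root:
    notes_path = save_notes(
        root,
        "app-portage/elogv-0.0",
        [("elog", "Configuration lives in /etc."), ("ewarn", "Run revdep-rebuild.")],
    )
    notes_text = notes_path.read_text(encoding="utf-8")

print(notes_text, end="")
```

Because the file is opened in append mode, notes from successive installs of the same package version would accumulate in one place.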
Re: [gentoo-dev] [RFC] Saving package emerge output (einfo, elog, ewarn, etc.) somewhere official
On Mon, 01 Dec 2008 15:35:32 -0700 Joe Peterson [EMAIL PROTECTED] wrote:

> My intention with the RFC was to see if the concept has any worth and to kick it around a bit. I do not really see this as a deficiency in Gentoo's technology (which I have a feeling is how many here have interpreted it), but simply something that, if done correctly, could be useful.

Maybe provide a real example to demonstrate the difference between the current solutions and what you're looking for, because I still don't understand what you're after (using all the different terms, "logs", "notes", "docs", "plain emerge log", ... without further explanation doesn't help much to clear things up).

Marius
Re: [gentoo-portage-dev] Re: search functionality in emerge
I completely forgot about Google's Summer of Code! Thanks for reminding me. Hopefully I won't forget again by the time summer rolls around; obviously I wouldn't mind getting a little extra money for doing something I'd do for free anyway.

On a more related note: what, exactly, does porttree.py do? And am I correct in thinking that my suffix tree(s) should somewhat replace porttree.py? Or should I be using porttree.py in order to populate my tree? I think I have the suffix tree sufficiently figured out; I'm just trying to determine where, exactly, the tree will fit into the portage code, and what the best way to populate it (with package names and some corresponding metadata) would be.

On Mon, Dec 1, 2008 at 2:34 AM, Duncan [EMAIL PROTECTED] wrote:

> Emma Strubell [EMAIL PROTECTED] posted [EMAIL PROTECTED], excerpted below, on Sun, 30 Nov 2008 18:42:11 -0500:
>
>> i am really interested in contributing to Gentoo and portage in the future, though. I'm thinking this summer I'll have a chance...
>
> FWIW, Gentoo usually participates in the Google Summer of Code. Assuming they have it again next year, if you're already considering spending some time on Gentoo code this summer, might as well try to get paid a little something for it. It could/should be a nice resume booster, too. =:^)
>
> --
> Duncan - List replies preferred. No HTML msgs.
> Every nonfree program has a lord, a master -- and if you use the program, he is your master. Richard Stallman
Re: [gentoo-portage-dev] Re: search functionality in emerge
Emma Strubell wrote:

> I completely forgot about Google's Summer of Code! Thanks for reminding me. Hopefully I won't forget again by the time summer rolls around, obviously I wouldn't mind getting a little extra money for doing something I'd do for free anyway.
>
> On a more related note: What, exactly, does porttree.py do? And am I correct in thinking that my suffix tree(s) should somewhat replace porttree.py? Or, should I be using porttree.py in order to populate my tree?

You should use porttree.py to populate it. Specifically, you should use portdbapi.aux_get() calls to access the package metadata that you'll need, similar to how the code in the existing search class accesses it.

> I think I have the suffix tree sufficiently figured out, I'm just trying to determine where, exactly, the tree will fit in to the portage code, and what the best way to populate it (with package names and some corresponding metadata) would be.

There are three possible times that I imagine a person might want to populate it:

1) Automatically after emerge --sync. This should not be mandatory, since it will be somewhat time consuming and some users are very sensitive about --sync time. Note that FEATURES=metadata-transfer is disabled by default in the latest versions of portage, specifically to reduce --sync time.

2) On demand, when emerge --search is invoked. The calling user will need appropriate file system permissions in order to update the search index.

3) On request, by calling a command that is specifically designed to generate the search index. This could be a subcommand of emaint.

For the index file format, it would be simplest to use a python pickle file, but you might choose another format if you'd like the index to be accessible without python and the portage API (probably not necessary).
--
Thanks,
Zac
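A pickled substring index along these lines could look like the sketch below. It is not Portage code: it uses a sorted list of name suffixes (a simple stand-in for a true suffix tree), the package names are taken from elsewhere in this digest, and in a real implementation the data would come from portdbapi.aux_get() as suggested above rather than from a hard-coded list.

```python
import bisect
import pickle


def build_index(names):
    """Return a sorted list of (suffix, full name) pairs for substring search."""
    entries = []
    for name in names:
        lowered = name.lower()
        for start in range(len(lowered)):
            entries.append((lowered[start:], name))
    entries.sort()
    return entries


def search(index, needle):
    """Return the sorted names containing needle (case-insensitive)."""
    needle = needle.lower()
    # (needle,) sorts before every (suffix, name) whose suffix starts with
    # needle, so bisect lands on the first candidate.
    position = bisect.bisect_left(index, (needle,))
    matches = set()
    while position < len(index) and index[position][0].startswith(needle):
        matches.add(index[position][1])
        position += 1
    return sorted(matches)


# Sample data only; a real index would be filled from the Portage tree.
PACKAGES = ["app-portage/elogv", "app-portage/elogviewer", "app-portage/kelogviewer"]

index = build_index(PACKAGES)
stored = pickle.dumps(index)      # what would be written to the index file
restored = pickle.loads(stored)   # what a later emerge --search would load

print(search(restored, "kelog"))
```

Every substring of a name is a prefix of one of its suffixes, which is why a binary search over sorted suffixes is enough; the price is an index that grows with the total length of all names.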
Re: [gentoo-portage-dev] Re: search functionality in emerge
Thanks for the clarification. I was planning on forcing an update of the index as part of emerge --sync, and on implementing a command that would update the search index (leaving it up to the user to update after making any manual changes to the portage tree). That way the search index should always be up to date when emerge -s is called.

It does make sense for the update upon --sync to be optional, but I guess I don't see why the update should always be SO slow. Of course the first population of the tree will take quite a while, but assuming regular (daily?) --syncs (and therefore updates to the index), subsequent updates shouldn't take very long, since there will only be a few (hundred?) changes to be made to the tree.

And I do plan on pickling the search tree :]

Emma
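The expectation that later updates are cheap can be sketched as an incremental refresh that reloads metadata only for entries whose modification time changed. The data layout (a dict keyed by package, storing the mtime next to the metadata) and the `load_metadata` callback are assumptions made for this example, not how Portage stores anything.

```python
def update_index(index, mtimes, load_metadata):
    """Bring index in line with the tree, reloading only new or changed entries.

    index:         dict mapping package -> (mtime, metadata); modified in place
    mtimes:        dict mapping package -> mtime currently found in the tree
    load_metadata: callable doing the expensive per-package lookup (hypothetical)
    Returns the list of packages that had to be reloaded.
    """
    # Drop packages that no longer exist in the tree.
    for package in list(index):
        if package not in mtimes:
            del index[package]

    reloaded = []
    for package, mtime in mtimes.items():
        cached = index.get(package)
        if cached is None or cached[0] != mtime:
            index[package] = (mtime, load_metadata(package))
            reloaded.append(package)
    return reloaded


# Made-up example: one changed, one unchanged, one removed, one new package.
index = {"cat/a-1": (100, "old"), "cat/b-1": (100, "kept"), "cat/c-1": (100, "gone")}
tree = {"cat/a-1": 200, "cat/b-1": 100, "cat/d-1": 300}

reloaded = update_index(index, tree, lambda package: f"metadata for {package}")
print(reloaded)
```

Only the changed and the new package trigger the expensive lookup here; the unchanged entry is left alone, which is what keeps a daily refresh short compared with the first full build.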
Re: [gentoo-portage-dev] Re: search functionality in emerge
I would suggest a different way of doing updates. When you manually change the portage tree, you have to make an overlay. An overlay, being updated and managed by a human, will always be small (unless someone writes a script that creates a million overlay updates, but I don't think that would be an efficient way to do anything). So, when you search, you can search the Portage tree with the index, which is updated with --sync, and then search the overlay, which is small and fast to search anyway. The overlay should not have an index in that case. If anyone changes the portage tree by hand, those changes will be lost with the next --sync, so no one should do it anyway - this case should not be considered at all.

Tambet - technique evolves to art, art evolves to magic, magic evolves to just doing.

2008/12/1 Emma Strubell [EMAIL PROTECTED]
> Thanks for the clarification. I was planning on forcing an update of the index as a part of emerge --sync, and implementing a command that would update the search index (leaving it up to the user to update after making any manual changes to the portage tree). That way the search index should always be up-to-date when emerge -s is called. It does make sense for the update upon --sync to be optional, but I guess I don't see why the update should always be SO slow. Of course the first population of the tree will take quite a while, but assuming regular (daily?) --syncs (and therefore updates to the index), subsequent updates shouldn't take very long, since there will only be a few (hundred?) changes to be made to the tree.
>
> And I do plan on pickling the search tree :]
>
> Emma
>
> On Mon, Dec 1, 2008 at 12:52 PM, Zac Medico [EMAIL PROTECTED] wrote:
> > Emma Strubell wrote:
> > > I completely forgot about Google's Summer of Code! Thanks for reminding me. Hopefully I won't forget again by the time summer rolls around, obviously I wouldn't mind getting a little extra money for doing something I'd do for free anyway.
> > >
> > > On a more related note: What, exactly, does porttree.py do? And am I correct in thinking that my suffix tree(s) should somewhat replace porttree.py? Or, should I be using porttree.py in order to populate my tree?
> >
> > You should use porttree.py to populate it. Specifically, you should use portdbapi.aux_get() calls to access the package metadata that you'll need, similar to how the code in the existing search class accesses it.
> >
> > > I think I have the suffix tree sufficiently figured out, I'm just trying to determine where, exactly, the tree will fit in to the portage code, and what the best way to populate it (with package names and some corresponding metadata) would be.
> >
> > There are three possible times that I imagine a person might want to populate it:
> >
> > 1) Automatically after emerge --sync. This should not be mandatory since it will be somewhat time consuming and some users are very sensitive about --sync time. Note that FEATURES=metadata-transfer is disabled by default in the latest versions of portage, specifically to reduce --sync time.
> >
> > 2) On demand, when emerge --search is invoked. The calling user will need appropriate file system permissions in order to update the search index.
> >
> > 3) On request, by calling a command that is specifically designed to generate the search index. This could be a subcommand of emaint.
> >
> > For the index file format, it would be simplest to use a python pickle file, but you might choose another format if you'd like the index to be accessible without python and the portage API (probably not necessary).
> >
> > --
> > Thanks,
> > Zac
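Zac's suggestion of a pickle file for the index can be sketched roughly as follows. This is only an illustration under stated assumptions: the sample package data, the file layout, and the helper names (build_index, save_index, load_index, search) are all hypothetical, and a real implementation would fill the index from portdbapi.aux_get() calls rather than from a hard-coded dictionary.

```python
import os
import pickle
import tempfile

# Hypothetical sample data. In Portage the descriptions would come from
# portdbapi.aux_get() calls; they are hard-coded here so the sketch runs
# without the portage API.
SAMPLE_PACKAGES = {
    "app-editors/vim": "Vim, an improved vi-style text editor",
    "app-editors/nano": "GNU GPL'd Pico clone with more functionality",
    "dev-lang/python": "An interpreted, interactive, object-oriented language",
}


def build_index(packages):
    """Return a plain dict mapping category/package names to descriptions."""
    return dict(packages)


def save_index(index, path):
    """Pickle the index to a temporary file, then rename it into place."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as handle:
        pickle.dump(index, handle, protocol=pickle.HIGHEST_PROTOCOL)
    # The rename means a reader never sees a half-written index.
    os.replace(tmp_path, path)


def load_index(path):
    """Return the unpickled index, or None if no index file exists."""
    try:
        with open(path, "rb") as handle:
            return pickle.load(handle)
    except FileNotFoundError:
        return None


def search(index, term):
    """Case-insensitive substring match on package names only."""
    term = term.lower()
    return sorted(name for name in index if term in name.lower())
```

Returning None for a missing index lets the caller fall back to the existing, unindexed search. Pickle files should only be loaded from locations the user trusts, since unpickling can execute arbitrary code.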
Re: [gentoo-portage-dev] Re: search functionality in emerge
Good point. I may just ignore overlays completely because 1) I don't use them and 2) does anyone really need to search an overlay anyway? Aren't any packages added via an overlay added deliberately?

On Mon, Dec 1, 2008 at 4:52 PM, Tambet [EMAIL PROTECTED] wrote:
> I would suggest a different way of doing updates. When you manually change the portage tree, you have to make an overlay. An overlay, being updated and managed by a human, will always be small (unless someone writes a script that creates a million overlay updates, but I don't think that would be an efficient way to do anything). So, when you search, you can search the Portage tree with the index, which is updated with --sync, and then search the overlay, which is small and fast to search anyway. The overlay should not have an index in that case. If anyone changes the portage tree by hand, those changes will be lost with the next --sync, so no one should do it anyway - this case should not be considered at all.
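Tambet's proposal (indexed lookup for the main tree, plain directory scan for overlays) might look something like the sketch below. The function names and the two-level category/package directory walk are assumptions made for illustration; this is not how Portage itself enumerates overlay packages, and a real implementation would also skip non-category directories such as profiles.

```python
import os


def scan_overlay(overlay_root):
    """Yield category/package names found two directory levels down.

    Simplification: every top-level directory is treated as a category.
    """
    for category in sorted(os.listdir(overlay_root)):
        category_dir = os.path.join(overlay_root, category)
        if not os.path.isdir(category_dir):
            continue
        for package in sorted(os.listdir(category_dir)):
            if os.path.isdir(os.path.join(category_dir, package)):
                yield category + "/" + package


def search_all(term, main_index, overlay_roots):
    """Search the indexed main tree first, then scan each overlay directly."""
    matches = {name for name in main_index if term in name}
    for root in overlay_roots:
        matches.update(name for name in scan_overlay(root) if term in name)
    return sorted(matches)
```

The overlay scan is linear in the number of overlay packages, which is cheap only as long as overlays stay small.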
Re: [gentoo-portage-dev] Re: search functionality in emerge
Emma Strubell wrote:
> 2) does anyone really need to search an overlay anyway?

Of course. Take large (semi-)official overlays like sunrise. They can easily be seen as a second portage tree.
Re: [gentoo-portage-dev] Time to say goodbye
On Sun, 2008-11-30 at 16:19 +0100, Marius Mauch wrote:
> So, time has come for me to realize that my time with Gentoo is over. I haven't actually been doing much Gentoo work over the last months due to personal reasons (nothing Gentoo related), and I don't see that situation changing in the near future. In fact I've already reassigned or dropped most of my responsibilities in Gentoo a while ago, so there are just a few pet projects left to give away:
>
> - my gentoo-stats project (in the portage/gentoo-stats svn repository). I know quite a few people are interested in the idea of collecting various statistical data from gentoo user systems, and I'd encourage everyone who wants to implement such a system to at least look at it (I might even have finished it if I hadn't wasted my time focusing on the wrong problems). There is also quite a bit of documentation that should help to get you started
> - a graphical security update tool (see bug #190397)
>
> So if anyone wants to adopt those, complete or just parts, just take them. As for Portage, Zac has practically already filled my role.
>
> So I guess that wraps it up. It's been a nice ride most of the time, but now it's time for me to leave the Gentoo train.
>
> Marius

I will always remember you as the guy who provided us with the much needed glsa*.py (thank you again).

Take care and I wish you the best in all your future endeavors.

--
Ned Ludd [EMAIL PROTECTED]
Gentoo Linux
Re: [gentoo-portage-dev] Re: search functionality in emerge
2008/12/2 Emma Strubell [EMAIL PROTECTED]
> True, true. Like I said, I don't really use overlays, so excuse my ignorance.

Do you know the order of doing things?

Rules of Optimization:
- Rule 1: Don't do it.
- Rule 2 (for experts only): Don't do it yet.

What this actually means: functionality comes first, readability comes next, and optimization comes last. Unless you are creating a fancy 3D engine for a kung fu game.

If you exclude overlays, you are removing functionality - and, indeed, absolutely has-to-be-there functionality, because no one would intuitively expect a search function to search only a subset of packages, however reasonable that subset might be. So you can't, just can't, add this into the portage base - you could only write yet another external search package for portage.

I looked at this code a bit:

Portage's __init__.py contains the comment "# search functionality". After this comment there is a nice and simple search class. The file also contains the method def action_sync(...), which contains the synchronization stuff.

The search class is initialized by setting up three databases - porttree, bintree and vartree, whatever those are. They are kept in the self._dbs array, and porttree is in self._portdb.

It contains some more methods:
- _findname(...) returns the result of self._portdb.findname(...) with the same parameters, or None if it does not exist.
- Other methods do similar things - each maps to one method or another.
- execute does the real search.

The important part here is "for package in self.portdb.cp_all()": it currently loops over the whole portage tree. All kinds of matching are done inside that loop. self.portdb obviously points to porttree.py (unless it points to the fake tree). cp_all takes all porttrees and does a simple file search inside them. This method should contain the optional index search.

    self.porttrees = [self.porttree_root] + \
        [os.path.realpath(t) for t in self.mysettings["PORTDIR_OVERLAY"].split()]

So self.porttrees contains the list of trees - the first of them is the root, the others are overlays.

Now, what you have to do will not be any harder just because of having overlay search, too. You have to create a method def cp_index(self), which returns a dictionary with package names as keys. The "for oroot..." loop will then use self.porttrees[1:], not self.porttrees - this will only search overlays. "d = {}" will be replaced with "d = self.cp_index()". If the index is not there, the old version will be used (thus, you have to make an internal porttrees variable, which contains either all trees or all except the first).

The other methods used by search are xmatch and aux_get - the first is used several times and the last is used to get the description. You have to cache the results of those specific queries and make the methods use your cache - as you can see, those parts of portage are already able to use overlays. Thus, you again have to put your code at the beginning of those functions: create index_xmatch and index_aux_get methods, then make xmatch and aux_get call them and return their results unless those are None (or something else, in case None is already a legal result). If they return None, the old code will run and do its job. If the index has not been created, the result is None. In the index_* methods, just check whether the query is one you can answer, and if it is, answer it.

Obviously, the simplest way to create your index is to delete the index and then use those same methods to query for all the necessary information. The fastest way would be to add index updating directly into sync, which you could do later.

Please also add commands to turn the index on and off (the latter should also delete it to save disk space). The default should be off until it's fast, small and reliable. Also notice that if the index is kept on the hard drive, it might be faster if it's compressed (gz, for example) - reading a smaller compressed file and decompressing it can take less time, at the cost of more processing power, than reading the full uncompressed file from disk.

Good luck!

> On Mon, Dec 1, 2008 at 5:17 PM, René 'Necoro' Neumann [EMAIL PROTECTED] wrote:
> > Emma Strubell wrote:
> > > 2) does anyone really need to search an overlay anyway?
> >
> > Of course. Take large (semi-)official overlays like sunrise. They can easily be seen as a second portage tree.
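The "try the index first, fall back to the old code when it returns None" pattern that Tambet describes for index_aux_get can be sketched in isolation like this. The IndexedLookup class and its slow_lookup callback are hypothetical stand-ins, not Portage code; only the method names aux_get and index_aux_get are taken from the message above, and their signatures here are assumed.

```python
class IndexedLookup:
    """Answer metadata queries from an optional index, else fall back."""

    def __init__(self, slow_lookup, index=None):
        # slow_lookup stands in for the existing, unindexed code path.
        self._slow_lookup = slow_lookup
        # index maps "category/package" -> {metadata key: value}, or is None
        # when no index has been created.
        self._index = index

    def index_aux_get(self, cpv, keys):
        """Return values from the index, or None if it cannot answer."""
        if self._index is None:
            return None
        entry = self._index.get(cpv)
        if entry is None or any(key not in entry for key in keys):
            return None
        return [entry[key] for key in keys]

    def aux_get(self, cpv, keys):
        """Use the index when it can answer, otherwise run the old code."""
        cached = self.index_aux_get(cpv, keys)
        if cached is not None:
            return cached
        return self._slow_lookup(cpv, keys)
```

Because None signals "the index cannot answer", switching the index off or deleting it only changes speed, never results.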
Re: [gentoo-portage-dev] Re: search functionality in emerge
Yes, yes, I know, you're right :] And thanks a bunch for the outline!

About the compression: I agree that it would be a good idea, but I don't know how to implement it. Not that it would be difficult... I'm guessing there's a gzip module for Python that would make it pretty straightforward? I think I'm getting ahead of myself, though. I haven't even implemented the suffix tree yet!

Emma

On Mon, Dec 1, 2008 at 7:20 PM, Tambet [EMAIL PROTECTED] wrote:
> Also notice that if the index is kept on the hard drive, it might be faster if it's compressed (gz, for example) - reading a smaller compressed file and decompressing it can take less time, at the cost of more processing power, than reading the full uncompressed file from disk.
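On the compression question: Python's standard library does include a gzip module, and its file objects can be handed directly to pickle. A minimal sketch, assuming the index is an ordinary picklable object; the function names save_compressed and load_compressed are made up for this example.

```python
import gzip
import pickle


def save_compressed(index, path):
    """Pickle the index straight into a gzip-compressed file."""
    with gzip.open(path, "wb") as handle:
        pickle.dump(index, handle, protocol=pickle.HIGHEST_PROTOCOL)


def load_compressed(path):
    """Read a gzip-compressed pickle back into memory."""
    with gzip.open(path, "rb") as handle:
        return pickle.load(handle)
```

Whether this is actually faster than an uncompressed pickle depends on disk speed versus CPU speed, so it would be worth measuring both before making compression the default.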