yes, yes, i know, you're right :]

and thanks a bunch for the outline! about the compression, I agree that it
would be a good idea, but I don't know how to implement it. not that it
would be difficult... I'm guessing there's a gzip module for python that
would make it pretty straightforward? I think I'm getting ahead of myself,
though. I haven't even implemented the suffix tree yet!
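
(for the record, here's roughly what I imagine the compression part would look like. this is only a sketch: it assumes the index ends up being a plain picklable object like a dict, it assumes a current Python 3, and save_index / load_index are names I made up, not anything in portage.)

```python
import gzip
import os
import pickle
import tempfile


def save_index(index, path):
    """Write the index (any picklable object) to a gzip-compressed file."""
    with gzip.open(path, "wb") as f:
        pickle.dump(index, f, protocol=pickle.HIGHEST_PROTOCOL)


def load_index(path):
    """Read the index back; return None if no index file exists."""
    if not os.path.exists(path):
        return None
    with gzip.open(path, "rb") as f:
        return pickle.load(f)


# Toy data standing in for the real index: package name -> description.
example_index = {
    "app-editors/vim": "Vim, an improved vi-style text editor",
    "dev-lang/python": "An interpreted, object-oriented programming language",
}

index_path = os.path.join(tempfile.mkdtemp(), "search_index.pickle.gz")
save_index(example_index, index_path)
restored = load_index(index_path)
```

whether this is actually faster than reading an uncompressed file is something I'd have to measure once there is a real index to test with.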

Emma

On Mon, Dec 1, 2008 at 7:20 PM, Tambet <[EMAIL PROTECTED]> wrote:

> 2008/12/2 Emma Strubell <[EMAIL PROTECTED]>
>
>> True, true. Like I said, I don't really use overlays, so excuse my
>> ignorance.
>>
>
> Do you know the order of doing things?
>
> Rules of Optimization:
>
>    - Rule 1: Don't do it.
>    - Rule 2 (for experts only): Don't do it yet.
>
> What this actually means: functionality comes first. Readability comes
> next. Optimization comes last. Unless you are creating a fancy 3D engine for
> a kung fu game.
>
> If you exclude overlays, you are removing functionality - and, indeed,
> absolutely has-to-be-there functionality, because no one would intuitively
> expect a search function to search only a subset of packages, however
> reasonable that subset might be. So you can't, just can't, add this package
> to the portage base - you could only write another external search package
> for portage.
>
> I looked at this code a bit:
> Portage's "__init__.py" contains the comment "# search functionality". After
> this comment, there is a nice and simple search class.
> It also contains the method "def action_sync(...)", which contains the
> synchronization stuff.
>
> Now, the search class is initialized by setting up 3 databases - porttree,
> bintree and vartree, whatever those are. Those go into the self._dbs array,
> and porttree goes into self._portdb.
>
> It contains some more methods:
> _findname(...) returns the result of self._portdb.findname(...) with the
> same parameters, or None if it does not exist.
> Other methods do similar things - each maps to one method or another.
> execute does the real search...
> Now - "for package in self.portdb.cp_all()" is important here ...it
> currently loops over the whole portage tree. All kinds of matching are done
> inside.
> self.portdb obviously points to porttree.py (unless it points to the fake
> tree).
> cp_all takes all porttrees and does a simple file search inside them. This
> method should contain the optional index search.
>
>               self.porttrees = [self.porttree_root] + \
>                       [os.path.realpath(t) for t in
>                        self.mysettings["PORTDIR_OVERLAY"].split()]
>
> So, self.porttrees contains a list of trees - the first of them is the
> root, the others are overlays.
>
> Now, what you have to do will not be any harder just because of having
> overlay search, too.
>
> You have to create a method, def cp_index(self), which returns a dictionary
> with package names as keys. The "for oroot..." loop will then go over
> "self.porttrees[1:]", not "self.porttrees" - this will only search the
> overlays. "d = {}" will be replaced with "d = self.cp_index()". If the index
> is not there, the old version will be used (thus, you have to make an
> internal porttrees variable, which contains either all trees or all except
> the first).
>
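
(just to check that I follow the cp_index idea, here is a toy sketch. none of this is real portage code: ToyPortdb, _scan_tree and the "scanned" list are stand-ins I invented, a "tree" is just a name plus a list of packages, and I'm assuming the index only covers the main tree.)

```python
class ToyPortdb:
    """Toy stand-in for the real portdb; a tree is (name, list of packages)."""

    def __init__(self, porttrees, index=None):
        self.porttrees = porttrees  # porttrees[0] is the main tree, rest are overlays
        self._index = index         # iterable of "category/package" names, or None
        self.scanned = []           # names of the trees that had to be walked

    def cp_index(self):
        """Return a dict keyed by package name, or None if there is no index."""
        if self._index is None:
            return None
        return dict.fromkeys(self._index)

    def _scan_tree(self, tree):
        # Placeholder for the slow directory walk over one tree.
        name, packages = tree
        self.scanned.append(name)
        return packages

    def cp_all(self):
        d = self.cp_index()
        if d is None:
            # No index: old behaviour, walk every tree.
            d, trees = {}, self.porttrees
        else:
            # Index already covers the main tree: walk the overlays only.
            trees = self.porttrees[1:]
        for tree in trees:
            for cp in self._scan_tree(tree):
                d[cp] = None
        return sorted(d)


main = ("main", ["app-editors/vim", "dev-lang/python"])
overlay = ("sunrise", ["app-misc/example"])  # made-up package name

with_index = ToyPortdb([main, overlay], index=main[1])
without_index = ToyPortdb([main, overlay])
```

both objects should give the same sorted package list; the difference is that with_index never has to walk the main tree.
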
> The other methods used by search are xmatch and aux_get - the first is used
> several times and the last is used to get the description. You have to cache
> the results of those specific queries and make them use your cache - as you
> can see, those parts of portage are already able to use overlays. Thus, you
> again have to put your code at the beginning of those functions: create
> index_xmatch and index_aux_get methods, then make xmatch and aux_get call
> them and return their results unless those are None (or something else, in
> case None is already a legal result) - if they return None, the old code
> runs and does its job. If the index has not been created, the result is
> None. In the index_* methods, just check whether the query is one you can
> answer and, if it is, answer it.
>
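
(and the same kind of toy sketch for the "try the index first, fall back to the old code" pattern. again an assumption-heavy illustration: ToyLookups, _slow_aux_get and the shape of the index dict are all made up, and the aux_get signature is simplified compared to whatever portage really uses.)

```python
class ToyLookups:
    """Toy illustration of answering from an index with a fallback."""

    def __init__(self, index=None):
        self._index = index   # {(cpv, key): value}, or None if no index was built
        self.slow_calls = 0   # counts how often the old code path ran

    def index_aux_get(self, cpv, keys):
        """Answer from the index, or return None if we cannot answer."""
        if self._index is None:
            return None
        try:
            return [self._index[(cpv, key)] for key in keys]
        except KeyError:
            # The index does not know this query: let the old code handle it.
            return None

    def _slow_aux_get(self, cpv, keys):
        # Placeholder for the existing, slower lookup.
        self.slow_calls += 1
        return ["<%s of %s>" % (key, cpv) for key in keys]

    def aux_get(self, cpv, keys):
        result = self.index_aux_get(cpv, keys)
        if result is not None:
            return result
        return self._slow_aux_get(cpv, keys)


db = ToyLookups({
    ("app-editors/vim-7.2", "DESCRIPTION"): "Vim, an improved vi-style text editor",
})
hit = db.aux_get("app-editors/vim-7.2", ["DESCRIPTION"])
calls_after_hit = db.slow_calls
miss = db.aux_get("dev-lang/python-2.5", ["DESCRIPTION"])
calls_after_miss = db.slow_calls
```

an index_xmatch would follow the same shape. if None can be a legitimate answer for some query, a separate sentinel object would be needed instead of None.
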
> Obviously, the simplest way to create your index is to delete the index,
> then use those same methods to query for all the necessary information - and
> the fastest way would be to add index updating directly into sync, which you
> could do later.
>
> Please, also, add commands to turn the index on and off (the latter should
> also delete it to save disk space). The default should be off until it's
> fast, small and reliable. Also notice that if the index is kept on the hard
> drive, it might be faster if it's compressed (gz, for example) -
> decompressing takes more processing power but less time than reading the
> whole uncompressed file from disk.
>
> Good luck!
>
> -----BEGIN PGP SIGNED MESSAGE-----
>>> Hash: SHA1
>>>
>>> Emma Strubell schrieb:
>>> > 2) does anyone really need to search an overlay anyway?
>>>
>>> Of course. Take large (semi-)official overlays like sunrise. They can
>>> easily be seen as a second portage tree.
>>> -----BEGIN PGP SIGNATURE-----
>>> Version: GnuPG v2.0.9 (GNU/Linux)
>>> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>>>
>>> iEYEARECAAYFAkk0YpEACgkQ4UOg/zhYFuD3jQCdG/ChDmyOncpgUKeMuqDxD1Tt
>>> 0mwAn2FXskdEAyFlmE8shUJy7WlhHr4S
>>> =+lCO
>>> -----END PGP SIGNATURE-----
>>>
>>> On Mon, Dec 1, 2008 at 5:17 PM, René 'Necoro' Neumann <[EMAIL PROTECTED]> wrote:
>>
>>
>
