Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-21 Thread Alec Warner
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Brian Harring wrote:
> On Tue, Sep 20, 2005 at 06:55:44PM -0500, Paul Varner wrote:
> 
>>On Tue, 2005-09-20 at 18:34 -0500, Brian Harring wrote:
>>
Updated patch to add a semaphore to control access to the global
portage.config object. Unless anyone sees any other issues with this
patch, I will be placing it into gentoolkit.
>>>
>>>Reason for a semaphore over threading.Lock ?
>>
>>No reason other than I'm used to thinking of them as semaphores instead
>>of locks, so semaphore is what I searched for in the Python docs.
> 
> Need to make some chunks of the rewrite thread safe, so I poked a bit;
> python -m timeit -n 1 -r 3 -s 'import threading;s=threading.Lock()' 
> 's.acquire();s.release()'
> around 1us
> python -m timeit -n 1 -r 3 -s 'import threading;s=threading.Semaphore()' 
> 's.acquire();s.release()'
> 20us
> 
> Granted... It's being anal.  Just surprised me, and figured I'd poke 
> to find out if you had a specific reason for using sem over the 
> simpler (pretty much straight cpython) Lock :)
> ~harring

I haven't looked at the code, but I would gather that threading.lock is
what most would call a Mutex.  Semaphore's generally allow N number of
things to do a task ( say have 6 readers on an object, wait on semaphore
until all readers release, then write ).  Usually a performance if there
are more reads vs writes on your objects.

Thus the Lock is really a semaphore -> N = 1 which allows the kernel to
do some fun optimizing on it's operations.

Of course this is all conjecture since none of the __doc__ strings are
helpful :P
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQzFwF2zglR5RwbyYAQKP5g/+Og6ptvOFp+p4dWXBkKrgPEf/HjDP9Mdb
uinNZtytWu4myvDH+nNzz91gFrXQ9vyH6nSv62RZg20jOnAXGEkBZJAnhujteu5i
p9/ljSUphVvUHvzgixjGsJtKrbDNh72gwakniO57ZrjolKBxV3EZL87aJmoTd3Ye
j9h07mKzla8yOS8rswAS+uSnwWNEHoNbQtcVXnhFw0hhsyasi74Vwm2zoRZs6qgS
EHozcNmP0djn5pJPt5LNXhAOrd9myR/pkHQGAUKzvcpgfaRAFgflTJmrPqRAi61F
J90+0ZnxSBtplOTifkrOQxAtr///ZEHoe9Hfp5Dg/WInau6FODsEmjZ0j5g+F/6U
1V+50s4EMt4qcoGLPsGPj6HEy+M0SLF/Aie1VZesx/IwWf4jyy2Z6iHknM6WDGbh
1d5c73f9r/2DiV7GdkwpgoTxIAb4i/IcUQSqhB2O4vaJOvLubeJG4UTGPc/AxpEu
Vx1Ruvok6xDMOO5W7umYBpcD93ZBBRvdZiRj2uZVIl+AOM9aCejc+sW3KT7J+YLS
zlyIkfI29IJdt5QXYweMIX1QiLQSZZQTDKuDmeoqWVm7g48u5f/JLR9wBP+eJluw
4T92h5781vExy7TngGD5qGFMHwP8MMMi4UPB4tp6fLTEqkJeasa79LZrRHTEtNqQ
etlk6kngm50=
=P0ck
-END PGP SIGNATURE-
-- 
gentoo-portage-dev@gentoo.org mailing list



Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-20 Thread Paul Varner
On Tue, 2005-09-20 at 19:00 -0500, Brian Harring wrote:
> On Tue, Sep 20, 2005 at 06:55:44PM -0500, Paul Varner wrote:
> > On Tue, 2005-09-20 at 18:34 -0500, Brian Harring wrote:
> > > > Updated patch to add a semaphore to control access to the global
> > > > portage.config object. Unless anyone sees any other issues with this
> > > > patch, I will be placing it into gentoolkit.
> > > Reason for a semaphore over threading.Lock ?
> > 
> > No reason other than I'm used to thinking of them as semaphores instead
> > of locks, so semaphore is what I searched for in the Python docs.
> Need to make some chunks of the rewrite thread safe, so I poked a bit;
> python -m timeit -n 1 -r 3 -s 'import threading;s=threading.Lock()' 
> 's.acquire();s.release()'
> around 1us
> python -m timeit -n 1 -r 3 -s 'import threading;s=threading.Semaphore()' 
> 's.acquire();s.release()'
> 20us
> 

Okay, I've changed the Semaphore to a Lock and removed the
self._settings.reset() call.  

Anything else before I commit this?

Regards,
Paul

PS: As an aside the command 'equery hasuse perl' goes from 34s to 2s on
my Pentium 4 with this patch.
-- 
gentoo-portage-dev@gentoo.org mailing list



Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-20 Thread Brian Harring
On Tue, Sep 20, 2005 at 06:55:44PM -0500, Paul Varner wrote:
> On Tue, 2005-09-20 at 18:34 -0500, Brian Harring wrote:
> > > Updated patch to add a semaphore to control access to the global
> > > portage.config object. Unless anyone sees any other issues with this
> > > patch, I will be placing it into gentoolkit.
> > Reason for a semaphore over threading.Lock ?
> 
> No reason other than I'm used to thinking of them as semaphores instead
> of locks, so semaphore is what I searched for in the Python docs.
Need to make some chunks of the rewrite thread safe, so I poked a bit;
python -m timeit -n 1 -r 3 -s 'import threading;s=threading.Lock()' 
's.acquire();s.release()'
around 1us
python -m timeit -n 1 -r 3 -s 'import threading;s=threading.Semaphore()' 
's.acquire();s.release()'
20us

Granted... It's being anal.  Just surprised me, and figured I'd poke 
to find out if you had a specific reason for using sem over the 
simpler (pretty much straight cpython) Lock :)
~harring


pgpIo2YQGJJ18.pgp
Description: PGP signature


Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-20 Thread Paul Varner
On Wed, 2005-09-21 at 08:46 +0900, Jason Stubbs wrote:
> > Updated patch to add a semaphore to control access to the global
> > portage.config object. Unless anyone sees any other issues with this
> > patch, I will be placing it into gentoolkit.
> 
> Is the settingslock created in __init__.py accessible in package.py? It may 
> just be another gem of python I didn't know about, but wouldn't it be not 
> available from that scope?
> 

It's available due to the ugly "from gentoolkit import *" at the start
of the file.

> Also, the calling of self._settings.reset() would be a big performance hit. 
> setcpv() does almost nothing if the per-package settings of the passed cpv 
> are identical to those of the previous. reset() on the other hand restacks 
> all configuration every time.

I hadn't noticed a big performance hit, but it is easily removed.

Regards,
Paul
-- 
gentoo-portage-dev@gentoo.org mailing list



Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-20 Thread Paul Varner
On Tue, 2005-09-20 at 18:34 -0500, Brian Harring wrote:
> > Updated patch to add a semaphore to control access to the global
> > portage.config object. Unless anyone sees any other issues with this
> > patch, I will be placing it into gentoolkit.
> Reason for a semaphore over threading.Lock ?

No reason other than I'm used to thinking of them as semaphores instead
of locks, so semaphore is what I searched for in the Python docs.

Paul
-- 
gentoo-portage-dev@gentoo.org mailing list



Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-20 Thread Jason Stubbs
On Wednesday 21 September 2005 02:37, Paul Varner wrote:
> On Fri, 2005-09-16 at 11:59 -0500, Paul Varner wrote:
> > http://bugs.gentoo.org/show_bug.cgi?id=90680
> >
> > Author: Paul Varner
> >
> > The current implementation of gentoolkit creates a portage.config
> > object for every package object that it creates. While this is the
> > correct thing to do from an object-oriented programming point of view,
> > this implementation consumes an excessive amount of memory and CPU. 
> > The proposed patch changes the portage.config object for each package
> > object to point to a single global object.
>
> Updated patch to add a semaphore to control access to the global
> portage.config object. Unless anyone sees any other issues with this
> patch, I will be placing it into gentoolkit.

Is the settingslock created in __init__.py accessible in package.py? It may 
just be another gem of python I didn't know about, but wouldn't it be not 
available from that scope?

Also, the calling of self._settings.reset() would be a big performance hit. 
setcpv() does almost nothing if the per-package settings of the passed cpv 
are identical to those of the previous. reset() on the other hand restacks 
all configuration every time.

-- 
Jason Stubbs


pgpfOMYUESex5.pgp
Description: PGP signature


Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-20 Thread Brian Harring
On Tue, Sep 20, 2005 at 12:37:27PM -0500, Paul Varner wrote:
> On Fri, 2005-09-16 at 11:59 -0500, Paul Varner wrote:
> > http://bugs.gentoo.org/show_bug.cgi?id=90680
> > 
> > Author: Paul Varner
> > 
> > The current implementation of gentoolkit creates a portage.config object
> > for every package object that it creates. While this is the correct
> > thing to do from an object-oriented programming point of view, this
> > implementation consumes an excessive amount of memory and CPU.  The
> > proposed patch changes the portage.config object for each package object
> > to point to a single global object.
> > 
> 
> Updated patch to add a semaphore to control access to the global
> portage.config object. Unless anyone sees any other issues with this
> patch, I will be placing it into gentoolkit.
Reason for a semaphore over threading.Lock ?
~harring


pgpmp2V7TRpXm.pgp
Description: PGP signature


Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-20 Thread Paul Varner
On Fri, 2005-09-16 at 11:59 -0500, Paul Varner wrote:
> http://bugs.gentoo.org/show_bug.cgi?id=90680
> 
> Author: Paul Varner
> 
> The current implementation of gentoolkit creates a portage.config object
> for every package object that it creates. While this is the correct
> thing to do from an object-oriented programming point of view, this
> implementation consumes an excessive amount of memory and CPU.  The
> proposed patch changes the portage.config object for each package object
> to point to a single global object.
> 

Updated patch to add a semaphore to control access to the global
portage.config object. Unless anyone sees any other issues with this
patch, I will be placing it into gentoolkit.

Regards,
Paul
diff -ur orig/gentoolkit/__init__.py new/gentoolkit/__init__.py
--- orig/gentoolkit/__init__.py	2005-06-07 16:18:08.0 -0500
+++ new/gentoolkit/__init__.py	2005-09-20 12:16:05.0 -0500
@@ -24,7 +24,9 @@
 import re
 import string
 import types
+from threading import Semaphore
 
+settingslock = Semaphore()
 settings = portage.config(clone=portage.settings)
 porttree = portage.db[portage.root]["porttree"]
 vartree  = portage.db[portage.root]["vartree"]
diff -ur orig/gentoolkit/package.py new/gentoolkit/package.py
--- orig/gentoolkit/package.py	2005-06-07 16:18:08.0 -0500
+++ new/gentoolkit/package.py	2005-09-20 12:16:30.0 -0500
@@ -23,8 +23,8 @@
 		if not self._scpv:
 			raise FatalError("invalid cpv: %s" % cpv)
 		self._db = None
-		settings.setcpv(self._cpv)
-		self._settings = portage.config(clone=settings)
+		self._settings = settings
+		self._settingslock = settingslock
 
 	def get_name(self):
 		"""Returns base name of package, no category nor version"""
@@ -44,7 +44,12 @@
 	def get_settings(self, key):
 		"""Returns the value of the given key for this package (useful 
 		for package.* files."""
-		return self._settings[key]
+		self._settingslock.acquire()
+		self._settings.setcpv(self._cpv)
+		v = self._settings[key]
+		self._settings.reset()
+		self._settingslock.release()
+		return v
 
 	def get_cpv(self):
 		"""Returns full Category/Package-Version string"""


Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-19 Thread Paul de Vrieze
On Monday 19 September 2005 10:26, Jason Stubbs wrote:
> On Monday 19 September 2005 17:18, Paul de Vrieze wrote:
> > I doubt though that the config object should be modified.
>
> The Package object needs to call setcpv() on the config object to get
> at the per-package USE flags after they have been stacked.
>
> > But if it must in some cases a lazy copy scheme is probably most
> > efficient.
>
> Unfortunately not in this case. There would either be one config
> instance (for processes that don't deal with USE flags) or as many
> config instances as there are packages (for processes that do).

Should probably read the source a bit but you're completely right of 
course. The only thing that could be done for this is splitting up the 
object, or having it do internal sharing of shared things. Or to make it 
clear, it is not some cases, but almost all cases.

Paul

-- 
Paul de Vrieze
Gentoo Developer
Mail: [EMAIL PROTECTED]
Homepage: http://www.devrieze.net


pgplfCFHSRRMu.pgp
Description: PGP signature


Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-19 Thread Jason Stubbs
On Monday 19 September 2005 17:18, Paul de Vrieze wrote:
> I doubt though that the config object should be modified.

The Package object needs to call setcpv() on the config object to get at the 
per-package USE flags after they have been stacked.

> But if it must in some cases a lazy copy scheme is probably most 
> efficient.

Unfortunately not in this case. There would either be one config instance 
(for processes that don't deal with USE flags) or as many config instances 
as there are packages (for processes that do).

-- 
Jason Stubbs


pgpbGBEWtpYiP.pgp
Description: PGP signature


Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-19 Thread Paul de Vrieze
On Saturday 17 September 2005 02:32, Alec Warner wrote:
> Jason Stubbs wrote:
> > On Saturday 17 September 2005 01:59, Paul Varner wrote:
> >>http://bugs.gentoo.org/show_bug.cgi?id=90680
> >>
> >>Author: Paul Varner
> >>
> >>The current implementation of gentoolkit creates a portage.config
> >> object for every package object that it creates. While this is the
> >> correct thing to do from an object-oriented programming point of
> >> view, this implementation consumes an excessive amount of memory and
> >> CPU.  The proposed patch changes the portage.config object for each
> >> package object to point to a single global object.
> >>
> >>If no one sees any serious issues with the patch, I will be placing
> >> it into gentoolkit.
> >
> > I tried doing this once before locally, but found some issue with it.
> > Unfortunately, I can't remember what that issue was. If you are
> > calling setcpv() for every call to the package object that utilizes
> > the config object and no utilizing packages (in gentoolkit or
> > otherwise) are utilizing threading, it should theoretically be okay.
> > Actually, I think it was the threading issue that delayed the fix.
>
> I can't remember the model for this, but there is some logic along the
> lines of intercepting config object writes with setattr and then
> cloning the config object.  That way if the config is read-only only 1
> is instantiated, but if you attempt to modify it, the config would
> clone itself, then proceed with the modification and return the cloned
> copy. Not sure how easy that would be to implement, perhaps some sort
> of wrapper class?

To share such an object the right way from an OO perspective would require 
to pass the object along at package object instanciation. I doubt though 
that the config object should be modified. But if it must in some cases a 
lazy copy scheme is probably most efficient. You'd probably have the 
editing code do something like:
this->editableConfig()->changeAttribute(...)

Where the editableConfig function checks whether a copy has been made, if 
so, it will just use that copy, else it will make a copy and return it.

Paul


-- 
Paul de Vrieze
Gentoo Developer
Mail: [EMAIL PROTECTED]
Homepage: http://www.devrieze.net


pgpDlxwohvwF2.pgp
Description: PGP signature


Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-16 Thread Paul Varner
On Sat, 2005-09-17 at 08:09 +0900, Jason Stubbs wrote:
> I tried doing this once before locally, but found some issue with it. 
> Unfortunately, I can't remember what that issue was. If you are calling 
> setcpv() for every call to the package object that utilizes the config 
> object and no utilizing packages (in gentoolkit or otherwise) are utilizing 
> threading, it should theoretically be okay. Actually, I think it was the 
> threading issue that delayed the fix.


(I just realize that I forgot to attach the patch, so the patch is
attached to this message)

The patch is definitely not thread safe. I know that equery and
gentoolkit do not utilize threads, but I can not speak for any other
third-party packages.

The quickest way to fix would probably be to implement a semaphore for
accessing the portage.config object.

Regards,
Paul
--- package.py.orig	2005-09-16 11:43:56.0 -0500
+++ package.py	2005-09-16 11:44:53.0 -0500
@@ -23,8 +23,7 @@
 		if not self._scpv:
 			raise FatalError("invalid cpv: %s" % cpv)
 		self._db = None
-		settings.setcpv(self._cpv)
-		self._settings = portage.config(clone=settings)
+		self._settings = settings
 
 	def get_name(self):
 		"""Returns base name of package, no category nor version"""
@@ -44,7 +43,10 @@
 	def get_settings(self, key):
 		"""Returns the value of the given key for this package (useful 
 		for package.* files."""
-		return self._settings[key]
+		self._settings.setcpv(self._cpv)
+		v = self._settings[key]
+		self._settings.reset()
+		return v
 
 	def get_cpv(self):
 		"""Returns full Category/Package-Version string"""


Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-16 Thread Alec Warner
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Jason Stubbs wrote:
> On Saturday 17 September 2005 01:59, Paul Varner wrote:
> 
>>http://bugs.gentoo.org/show_bug.cgi?id=90680
>>
>>Author: Paul Varner
>>
>>The current implementation of gentoolkit creates a portage.config object
>>for every package object that it creates. While this is the correct
>>thing to do from an object-oriented programming point of view, this
>>implementation consumes an excessive amount of memory and CPU.  The
>>proposed patch changes the portage.config object for each package object
>>to point to a single global object.
>>
>>If no one sees any serious issues with the patch, I will be placing it
>>into gentoolkit.
> 
> 
> I tried doing this once before locally, but found some issue with it. 
> Unfortunately, I can't remember what that issue was. If you are calling 
> setcpv() for every call to the package object that utilizes the config 
> object and no utilizing packages (in gentoolkit or otherwise) are utilizing 
> threading, it should theoretically be okay. Actually, I think it was the 
> threading issue that delayed the fix.
> 

I can't remember the model for this, but there is some logic along the
lines of intercepting config object writes with setattr and then cloning
the config object.  That way if the config is read-only only 1 is
instantiated, but if you attempt to modify it, the config would clone
itself, then proceed with the modification and return the cloned copy.
Not sure how easy that would be to implement, perhaps some sort of
wrapper class?

-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.1 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iQIVAwUBQytkOmzglR5RwbyYAQKDvw//Wz/TpPF35sf+yhz5dyQ09qN5Ug9qiTec
qDnf364k+lzUSXzzNEjhpgSpyyKcEv30CjZWTn+g3IoH7aQ2v2mIIF+5V0C/9kxP
JH0a+ibh+Mvzv28tteEE+VX4yTykNz8L5b6twwQO/53GfY8Dj/l19AqOrjgHOfGf
/nvslpqEiFmAzc0RdxMuOAiFHdcfZgZyQpHsHboTZ7kQQpFTX4h1nShNAkGiceBI
ucfIs/Fp0RunDUYK2pfIedbLKsfRKyySaUOXXPUrTiwufO7zbyvRZI/+mBLJlL6v
Ud49wak10G8dFNZA9J3LgWRoy2Dqxrp6Eu7gTp5U4ONKLSJo3CSVa+1P2eACmerl
RqlAaVxtz+lpaXoiTSNoFKzjrIgidT00d7kG0v/gbI0vsqVy31nEuaSqWgAwjBit
GeDV8A9Tnxe+UNyUkd4BpBX3p8bX4EyDUNqfv0OChuCXbozbc1yizksSps8+Y3iQ
uG29o90/ExvsRQVAFKhWBjc9aYgghMhTO8yrKGkp4eI+ewa82QhLnlhgdQRMLSqY
ox+IRPxClIVRLzUZ4m3BsITy3QdSqvVhrB7ITofIFjXSY0OPQivA0HSebnPDptlU
mPreE6NzWs7KGhjhBvatqKnM7CuzW9JogolXkk+vT/pvSWlAoPc48mAW77CmKO5G
q3754DSKb3A=
=uzjq
-END PGP SIGNATURE-
-- 
gentoo-portage-dev@gentoo.org mailing list



Re: [gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-16 Thread Jason Stubbs
On Saturday 17 September 2005 01:59, Paul Varner wrote:
> http://bugs.gentoo.org/show_bug.cgi?id=90680
>
> Author: Paul Varner
>
> The current implementation of gentoolkit creates a portage.config object
> for every package object that it creates. While this is the correct
> thing to do from an object-oriented programming point of view, this
> implementation consumes an excessive amount of memory and CPU.  The
> proposed patch changes the portage.config object for each package object
> to point to a single global object.
>
> If no one sees any serious issues with the patch, I will be placing it
> into gentoolkit.

I tried doing this once before locally, but found some issue with it. 
Unfortunately, I can't remember what that issue was. If you are calling 
setcpv() for every call to the package object that utilizes the config 
object and no utilizing packages (in gentoolkit or otherwise) are utilizing 
threading, it should theoretically be okay. Actually, I think it was the 
threading issue that delayed the fix.

-- 
Jason Stubbs


pgptcGguypjiz.pgp
Description: PGP signature


[gentoo-portage-dev] PATCH: gentoolkit: Make portage.config object a global object

2005-09-16 Thread Paul Varner
http://bugs.gentoo.org/show_bug.cgi?id=90680

Author: Paul Varner

The current implementation of gentoolkit creates a portage.config object
for every package object that it creates. While this is the correct
thing to do from an object-oriented programming point of view, this
implementation consumes an excessive amount of memory and CPU.  The
proposed patch changes the portage.config object for each package object
to point to a single global object.

If no one sees any serious issues with the patch, I will be placing it
into gentoolkit.

Regards,
Paul
-- 
gentoo-portage-dev@gentoo.org mailing list