Re: [pool] equal instances
On 11/06/2011 18:18, Phil Steitz wrote: > On 6/11/11 2:54 AM, Mark Thomas wrote: >> On 11/06/2011 08:44, Phil Steitz wrote: >> I too think that the second option is the way to go using system >> identity hashcodes. I'm not sure that the first option will be faster >> since iterating through the List of circulating objects is likely to be >> slower than generating a system identity hashcode and doing a lookup in >> a Map. If testing pool2/dbcp2 shows that the system identity hashcode >> makes it significantly slower than Tomcat's jdbc-pool then we can look >> at this again. > > What I meant in the second comment was to basically leave the code > as is with the restriction that instances must be equals-discernible > by default and provide either a subclass or a config option that > would trigger the more complicated code to check identity hashes. Ah right. Sorry. I misunderstood. That could be a plan. Mark - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 6/11/11 4:31 AM, Mark Thomas wrote: > On 11/06/2011 12:26, Honton, Charles wrote: >> Please look at java.util.IdentityHashMap again. I think it does >> everything you're looking for except concurrent access. Given other >> requirements, I'm not sure a ConcurrentHashMap will buy much performance >> anyway. > I suggest you take a look in the archives. Synchronisation has a major > negative impact on performance. IdentityHashMap is completely the wrong > solution for this use case. > > Mark > > >> chas >> >> >> On 6/11/11 5:54 AM, "Mark Thomas" wrote: >> >>> On 11/06/2011 08:44, Phil Steitz wrote: I have looked at this some more and I think we have two choices, depending on how much we want to be able to do in terms of monitoring and instance management. The simplest solution is to enable only holding references to instances checked out to clients, but no tracking of last active times, etc. by the pool itself (leaving this to the pooled objects themselves, as AbandonedObjectPool does now). That could be done with less ambitious _allObjects and PoolableObject classes. Actually, _allObjects would go away, replaced by a List of _circulatingObjects (like AbandonedObjectPool trace) and PoolableObject would really only maintain state for idle instances or instances undergoing lifecycle transitions. >>> The problem with a List and objects where A.equals(B) is that we'll have >>> to iterate through the list testing objects for equality when an object >>> is returned in order to remove the right object. That will be slow. (If >>> we want to prevent the same object being returned multiple times). >>> If we want to be able to track things like last active time (as the trunk code now enables), we need to be able to identify returning objects uniquely. That means either we impose the factory discernibility requirement or use system identity hashcodes. I would like to keep the monitoring / management options open, so my vote is for the second option. I wonder, though, if it makes sense to keep the base implementation simple (and a little faster) and restrictive and provide the equals-tolerant functionality in a subclass or via a config option. >>> I too think that the second option is the way to go using system >>> identity hashcodes. I'm not sure that the first option will be faster >>> since iterating through the List of circulating objects is likely to be >>> slower than generating a system identity hashcode and doing a lookup in >>> a Map. If testing pool2/dbcp2 shows that the system identity hashcode >>> makes it significantly slower than Tomcat's jdbc-pool then we can look >>> at this again. Speaking of that, you will notice that [performance] is now set up to handle both dbcp2 and tomcat jdbc-pool and a mock driver. I would strongly encourage testing with variable load profiles as [performance] can generate rather than the primitive threads-in-a-loop setup in the jdbc-pool unit tests. To really test the pool performance, you need the pool to grow and shrink. That will not happen with constant load. Phil >>> Mark >>> >>> >>> >>> - >>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >>> For additional commands, e-mail: dev-h...@commons.apache.org >>> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >> For additional commands, e-mail: dev-h...@commons.apache.org >> > > > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 6/11/11 2:54 AM, Mark Thomas wrote: > On 11/06/2011 08:44, Phil Steitz wrote: >> I have looked at this some more and I think we have two choices, >> depending on how much we want to be able to do in terms of >> monitoring and instance management. The simplest solution is to >> enable only holding references to instances checked out to clients, >> but no tracking of last active times, etc. by the pool itself >> (leaving this to the pooled objects themselves, as >> AbandonedObjectPool does now). That could be done with less >> ambitious _allObjects and PoolableObject classes. Actually, >> _allObjects would go away, replaced by a List of _circulatingObjects >> (like AbandonedObjectPool trace) and PoolableObject would really >> only maintain state for idle instances or instances undergoing >> lifecycle transitions. > The problem with a List and objects where A.equals(B) is that we'll have > to iterate through the list testing objects for equality when an object > is returned in order to remove the right object. That will be slow. (If > we want to prevent the same object being returned multiple times). Agreed. >> If we want to be able to track things like last active time (as the >> trunk code now enables), we need to be able to identify returning >> objects uniquely. That means either we impose the factory >> discernibility requirement or use system identity hashcodes. >> >> I would like to keep the monitoring / management options open, so my >> vote is for the second option. I wonder, though, if it makes sense >> to keep the base implementation simple (and a little faster) and >> restrictive and provide the equals-tolerant functionality in a >> subclass or via a config option. > I too think that the second option is the way to go using system > identity hashcodes. I'm not sure that the first option will be faster > since iterating through the List of circulating objects is likely to be > slower than generating a system identity hashcode and doing a lookup in > a Map. If testing pool2/dbcp2 shows that the system identity hashcode > makes it significantly slower than Tomcat's jdbc-pool then we can look > at this again. What I meant in the second comment was to basically leave the code as is with the restriction that instances must be equals-discernible by default and provide either a subclass or a config option that would trigger the more complicated code to check identity hashes. Phil > Mark > > > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 11/06/2011 12:26, Honton, Charles wrote: > Please look at java.util.IdentityHashMap again. I think it does > everything you're looking for except concurrent access. Given other > requirements, I'm not sure a ConcurrentHashMap will buy much performance > anyway. I suggest you take a look in the archives. Synchronisation has a major negative impact on performance. IdentityHashMap is completely the wrong solution for this use case. Mark > chas > > > On 6/11/11 5:54 AM, "Mark Thomas" wrote: > >> On 11/06/2011 08:44, Phil Steitz wrote: >>> I have looked at this some more and I think we have two choices, >>> depending on how much we want to be able to do in terms of >>> monitoring and instance management. The simplest solution is to >>> enable only holding references to instances checked out to clients, >>> but no tracking of last active times, etc. by the pool itself >>> (leaving this to the pooled objects themselves, as >>> AbandonedObjectPool does now). That could be done with less >>> ambitious _allObjects and PoolableObject classes. Actually, >>> _allObjects would go away, replaced by a List of _circulatingObjects >>> (like AbandonedObjectPool trace) and PoolableObject would really >>> only maintain state for idle instances or instances undergoing >>> lifecycle transitions. >> >> The problem with a List and objects where A.equals(B) is that we'll have >> to iterate through the list testing objects for equality when an object >> is returned in order to remove the right object. That will be slow. (If >> we want to prevent the same object being returned multiple times). >> >>> If we want to be able to track things like last active time (as the >>> trunk code now enables), we need to be able to identify returning >>> objects uniquely. That means either we impose the factory >>> discernibility requirement or use system identity hashcodes. >>> >>> I would like to keep the monitoring / management options open, so my >>> vote is for the second option. I wonder, though, if it makes sense >>> to keep the base implementation simple (and a little faster) and >>> restrictive and provide the equals-tolerant functionality in a >>> subclass or via a config option. >> >> I too think that the second option is the way to go using system >> identity hashcodes. I'm not sure that the first option will be faster >> since iterating through the List of circulating objects is likely to be >> slower than generating a system identity hashcode and doing a lookup in >> a Map. If testing pool2/dbcp2 shows that the system identity hashcode >> makes it significantly slower than Tomcat's jdbc-pool then we can look >> at this again. >> >> Mark >> >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >> For additional commands, e-mail: dev-h...@commons.apache.org >> > > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
Please look at java.util.IdentityHashMap again. I think it does everything you're looking for except concurrent access. Given other requirements, I'm not sure a ConcurrentHashMap will buy much performance anyway. chas On 6/11/11 5:54 AM, "Mark Thomas" wrote: >On 11/06/2011 08:44, Phil Steitz wrote: >> I have looked at this some more and I think we have two choices, >> depending on how much we want to be able to do in terms of >> monitoring and instance management. The simplest solution is to >> enable only holding references to instances checked out to clients, >> but no tracking of last active times, etc. by the pool itself >> (leaving this to the pooled objects themselves, as >> AbandonedObjectPool does now). That could be done with less >> ambitious _allObjects and PoolableObject classes. Actually, >> _allObjects would go away, replaced by a List of _circulatingObjects >> (like AbandonedObjectPool trace) and PoolableObject would really >> only maintain state for idle instances or instances undergoing >> lifecycle transitions. > >The problem with a List and objects where A.equals(B) is that we'll have >to iterate through the list testing objects for equality when an object >is returned in order to remove the right object. That will be slow. (If >we want to prevent the same object being returned multiple times). > >> If we want to be able to track things like last active time (as the >> trunk code now enables), we need to be able to identify returning >> objects uniquely. That means either we impose the factory >> discernibility requirement or use system identity hashcodes. >> >> I would like to keep the monitoring / management options open, so my >> vote is for the second option. I wonder, though, if it makes sense >> to keep the base implementation simple (and a little faster) and >> restrictive and provide the equals-tolerant functionality in a >> subclass or via a config option. > >I too think that the second option is the way to go using system >identity hashcodes. I'm not sure that the first option will be faster >since iterating through the List of circulating objects is likely to be >slower than generating a system identity hashcode and doing a lookup in >a Map. If testing pool2/dbcp2 shows that the system identity hashcode >makes it significantly slower than Tomcat's jdbc-pool then we can look >at this again. > >Mark > > > >- >To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >For additional commands, e-mail: dev-h...@commons.apache.org > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 11/06/2011 08:44, Phil Steitz wrote: > I have looked at this some more and I think we have two choices, > depending on how much we want to be able to do in terms of > monitoring and instance management. The simplest solution is to > enable only holding references to instances checked out to clients, > but no tracking of last active times, etc. by the pool itself > (leaving this to the pooled objects themselves, as > AbandonedObjectPool does now). That could be done with less > ambitious _allObjects and PoolableObject classes. Actually, > _allObjects would go away, replaced by a List of _circulatingObjects > (like AbandonedObjectPool trace) and PoolableObject would really > only maintain state for idle instances or instances undergoing > lifecycle transitions. The problem with a List and objects where A.equals(B) is that we'll have to iterate through the list testing objects for equality when an object is returned in order to remove the right object. That will be slow. (If we want to prevent the same object being returned multiple times). > If we want to be able to track things like last active time (as the > trunk code now enables), we need to be able to identify returning > objects uniquely. That means either we impose the factory > discernibility requirement or use system identity hashcodes. > > I would like to keep the monitoring / management options open, so my > vote is for the second option. I wonder, though, if it makes sense > to keep the base implementation simple (and a little faster) and > restrictive and provide the equals-tolerant functionality in a > subclass or via a config option. I too think that the second option is the way to go using system identity hashcodes. I'm not sure that the first option will be faster since iterating through the List of circulating objects is likely to be slower than generating a system identity hashcode and doing a lookup in a Map. If testing pool2/dbcp2 shows that the system identity hashcode makes it significantly slower than Tomcat's jdbc-pool then we can look at this again. Mark - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 6/10/11 2:38 PM, Mark Thomas wrote: > On 10/06/2011 22:33, Gary Gregory wrote: >> On Jun 10, 2011, at 13:47, Mark Thomas wrote: >> >>> On 09/06/2011 17:47, Phil Steitz wrote: I guess I "want it both ways" here. We already have a use case for tracking instances - abandoned connection tracking in DBCP and *lots* of requests and practical uses for holding references to instances in circulation (mostly from dbcp, but I recall some from people managing socket pools or message queue pools). I know I flip/flopped on this - initially thinking it was a reasonable expectation for factories to produce equals-discernible instances. But I am afraid it will be the source of hard to find bugs and I can see the argument on the other side - i.e., a pool should be perfectly happy managing indiscernible objects. I never agreed with Leibniz [1] - he he. As you pointed out, Mark, 1.x gets around this problem by wrapping / unwrapping but that strategy makes identity-checking awkward. I will think about this some more, but a hybrid strategy where _allObjects becomes a HashBag (steal from [collections]) used for quick checking and we add _allInstances to hold wrapped instances. Could be nonsense, but something along these lines might work. >>> I've done some testing and under heavy load (10 threads concurrently >>> creating 100 objects each) on a 8 core machine I saw zero >>> duplicates. With this in mind I think I can do the following for pool 2: >>> - have calls to PoolableObjectFactory#makeObject() check >>> System#idenityHashcode() and if a duplicate is detected discard that >>> object and ask the factory for a new one; >> With a retry limit to avoid infinite loops. > Indeed. > >> Is this a feature that should be in a subclass or a toggle? Or do we >> want it baked it? > I think it is going to have to be baked in. We have requirements that > mean we need to track all the objects and we need a scalable solution > for doing that. However we slice it, that is going to come down to > hashes and we need to prevent collisions. > > Mark > >> Gary >> >>> - use some simple wrapper based on identityHashcode around objects in >>> the idle object pool and the all objects map >>> - when an object is returned, use identityHashcode to map it back to the >>> right object in the all object pool >>> >>> The requirement this places on PoolableObjectFactories is that they must >>> produce objects with identityHashcodes that only rarely have collisions. >>> Since this is under JVM control and my test case shows this is rare, I'm >>> happy with this. >>> The important thing at this point is to agree on what invariants we expect [pool] and user factories to maintain. Since I have already changed my mind once on this, and I understand the practical arguments in favor of staying with the setup now in trunk (equal instances not allowed), I would like to hear what others have to say on this. >>> I think we can have it both ways this time :) >>> >>> I'll leave this a little while before I do any coding to give folks a >>> chance to comment. I have looked at this some more and I think we have two choices, depending on how much we want to be able to do in terms of monitoring and instance management. The simplest solution is to enable only holding references to instances checked out to clients, but no tracking of last active times, etc. by the pool itself (leaving this to the pooled objects themselves, as AbandonedObjectPool does now). That could be done with less ambitious _allObjects and PoolableObject classes. Actually, _allObjects would go away, replaced by a List of _circulatingObjects (like AbandonedObjectPool trace) and PoolableObject would really only maintain state for idle instances or instances undergoing lifecycle transitions. If we want to be able to track things like last active time (as the trunk code now enables), we need to be able to identify returning objects uniquely. That means either we impose the factory discernibility requirement or use system identity hashcodes. I would like to keep the monitoring / management options open, so my vote is for the second option. I wonder, though, if it makes sense to keep the base implementation simple (and a little faster) and restrictive and provide the equals-tolerant functionality in a subclass or via a config option. Phil >>> Mark >>> >>> >>> >>> - >>> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >>> For additional commands, e-mail: dev-h...@commons.apache.org >>> >> - >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >> For additional commands, e-mail: dev-h...@commons.apache.org >> > > > > - > To unsubscribe, e-mail: dev
Re: [pool] equal instances
On 10/06/2011 22:33, Gary Gregory wrote: > On Jun 10, 2011, at 13:47, Mark Thomas wrote: > >> On 09/06/2011 17:47, Phil Steitz wrote: >>> I guess I "want it both ways" here. We already have a use case for >>> tracking instances - abandoned connection tracking in DBCP and >>> *lots* of requests and practical uses for holding references to >>> instances in circulation (mostly from dbcp, but I recall some from >>> people managing socket pools or message queue pools). >>> >>> I know I flip/flopped on this - initially thinking it was a >>> reasonable expectation for factories to produce equals-discernible >>> instances. But I am afraid it will be the source of hard to find >>> bugs and I can see the argument on the other side - i.e., a pool >>> should be perfectly happy managing indiscernible objects. I never >>> agreed with Leibniz [1] - he he. >>> >>> As you pointed out, Mark, 1.x gets around this problem by wrapping / >>> unwrapping but that strategy makes identity-checking awkward. I >>> will think about this some more, but a hybrid strategy where >>> _allObjects becomes a HashBag (steal from [collections]) used for >>> quick checking and we add _allInstances to hold wrapped instances. >>> Could be nonsense, but something along these lines might work. >> >> I've done some testing and under heavy load (10 threads concurrently >> creating 100 objects each) on a 8 core machine I saw zero >> duplicates. With this in mind I think I can do the following for pool 2: >> - have calls to PoolableObjectFactory#makeObject() check >> System#idenityHashcode() and if a duplicate is detected discard that >> object and ask the factory for a new one; > > With a retry limit to avoid infinite loops. Indeed. > Is this a feature that should be in a subclass or a toggle? Or do we > want it baked it? I think it is going to have to be baked in. We have requirements that mean we need to track all the objects and we need a scalable solution for doing that. However we slice it, that is going to come down to hashes and we need to prevent collisions. Mark > Gary > >> - use some simple wrapper based on identityHashcode around objects in >> the idle object pool and the all objects map >> - when an object is returned, use identityHashcode to map it back to the >> right object in the all object pool >> >> The requirement this places on PoolableObjectFactories is that they must >> produce objects with identityHashcodes that only rarely have collisions. >> Since this is under JVM control and my test case shows this is rare, I'm >> happy with this. >> >>> The important thing at this point is to agree on what invariants we >>> expect [pool] and user factories to maintain. Since I have already >>> changed my mind once on this, and I understand the practical >>> arguments in favor of staying with the setup now in trunk (equal >>> instances not allowed), I would like to hear what others have to say >>> on this. >> >> I think we can have it both ways this time :) >> >> I'll leave this a little while before I do any coding to give folks a >> chance to comment. >> >> Mark >> >> >> >> - >> To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org >> For additional commands, e-mail: dev-h...@commons.apache.org >> > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On Jun 10, 2011, at 13:47, Mark Thomas wrote: > On 09/06/2011 17:47, Phil Steitz wrote: >> I guess I "want it both ways" here. We already have a use case for >> tracking instances - abandoned connection tracking in DBCP and >> *lots* of requests and practical uses for holding references to >> instances in circulation (mostly from dbcp, but I recall some from >> people managing socket pools or message queue pools). >> >> I know I flip/flopped on this - initially thinking it was a >> reasonable expectation for factories to produce equals-discernible >> instances. But I am afraid it will be the source of hard to find >> bugs and I can see the argument on the other side - i.e., a pool >> should be perfectly happy managing indiscernible objects. I never >> agreed with Leibniz [1] - he he. >> >> As you pointed out, Mark, 1.x gets around this problem by wrapping / >> unwrapping but that strategy makes identity-checking awkward. I >> will think about this some more, but a hybrid strategy where >> _allObjects becomes a HashBag (steal from [collections]) used for >> quick checking and we add _allInstances to hold wrapped instances. >> Could be nonsense, but something along these lines might work. > > I've done some testing and under heavy load (10 threads concurrently > creating 100 objects each) on a 8 core machine I saw zero > duplicates. With this in mind I think I can do the following for pool 2: > - have calls to PoolableObjectFactory#makeObject() check > System#idenityHashcode() and if a duplicate is detected discard that > object and ask the factory for a new one; With a retry limit to avoid infinite loops. Is this a feature that should be in a subclass or a toggle? Or do we want it baked it? Gary > - use some simple wrapper based on identityHashcode around objects in > the idle object pool and the all objects map > - when an object is returned, use identityHashcode to map it back to the > right object in the all object pool > > The requirement this places on PoolableObjectFactories is that they must > produce objects with identityHashcodes that only rarely have collisions. > Since this is under JVM control and my test case shows this is rare, I'm > happy with this. > >> The important thing at this point is to agree on what invariants we >> expect [pool] and user factories to maintain. Since I have already >> changed my mind once on this, and I understand the practical >> arguments in favor of staying with the setup now in trunk (equal >> instances not allowed), I would like to hear what others have to say >> on this. > > I think we can have it both ways this time :) > > I'll leave this a little while before I do any coding to give folks a > chance to comment. > > Mark > > > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 6/10/11 10:46 AM, Mark Thomas wrote: > On 09/06/2011 17:47, Phil Steitz wrote: >> I guess I "want it both ways" here. We already have a use case for >> tracking instances - abandoned connection tracking in DBCP and >> *lots* of requests and practical uses for holding references to >> instances in circulation (mostly from dbcp, but I recall some from >> people managing socket pools or message queue pools). >> >> I know I flip/flopped on this - initially thinking it was a >> reasonable expectation for factories to produce equals-discernible >> instances. But I am afraid it will be the source of hard to find >> bugs and I can see the argument on the other side - i.e., a pool >> should be perfectly happy managing indiscernible objects. I never >> agreed with Leibniz [1] - he he. >> >> As you pointed out, Mark, 1.x gets around this problem by wrapping / >> unwrapping but that strategy makes identity-checking awkward. I >> will think about this some more, but a hybrid strategy where >> _allObjects becomes a HashBag (steal from [collections]) used for >> quick checking and we add _allInstances to hold wrapped instances. >> Could be nonsense, but something along these lines might work. > I've done some testing and under heavy load (10 threads concurrently > creating 100 objects each) on a 8 core machine I saw zero > duplicates. With this in mind I think I can do the following for pool 2: > - have calls to PoolableObjectFactory#makeObject() check > System#idenityHashcode() and if a duplicate is detected discard that > object and ask the factory for a new one; > - use some simple wrapper based on identityHashcode around objects in > the idle object pool and the all objects map > - when an object is returned, use identityHashcode to map it back to the > right object in the all object pool > > The requirement this places on PoolableObjectFactories is that they must > produce objects with identityHashcodes that only rarely have collisions. > Since this is under JVM control and my test case shows this is rare, I'm > happy with this. > >> The important thing at this point is to agree on what invariants we >> expect [pool] and user factories to maintain. Since I have already >> changed my mind once on this, and I understand the practical >> arguments in favor of staying with the setup now in trunk (equal >> instances not allowed), I would like to hear what others have to say >> on this. > I think we can have it both ways this time :) Sweet! > I'll leave this a little while before I do any coding to give folks a > chance to comment. I will play with this a little this evening too. Phil > Mark > > > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 09/06/2011 17:47, Phil Steitz wrote: > I guess I "want it both ways" here. We already have a use case for > tracking instances - abandoned connection tracking in DBCP and > *lots* of requests and practical uses for holding references to > instances in circulation (mostly from dbcp, but I recall some from > people managing socket pools or message queue pools). > > I know I flip/flopped on this - initially thinking it was a > reasonable expectation for factories to produce equals-discernible > instances. But I am afraid it will be the source of hard to find > bugs and I can see the argument on the other side - i.e., a pool > should be perfectly happy managing indiscernible objects. I never > agreed with Leibniz [1] - he he. > > As you pointed out, Mark, 1.x gets around this problem by wrapping / > unwrapping but that strategy makes identity-checking awkward. I > will think about this some more, but a hybrid strategy where > _allObjects becomes a HashBag (steal from [collections]) used for > quick checking and we add _allInstances to hold wrapped instances. > Could be nonsense, but something along these lines might work. I've done some testing and under heavy load (10 threads concurrently creating 100 objects each) on a 8 core machine I saw zero duplicates. With this in mind I think I can do the following for pool 2: - have calls to PoolableObjectFactory#makeObject() check System#idenityHashcode() and if a duplicate is detected discard that object and ask the factory for a new one; - use some simple wrapper based on identityHashcode around objects in the idle object pool and the all objects map - when an object is returned, use identityHashcode to map it back to the right object in the all object pool The requirement this places on PoolableObjectFactories is that they must produce objects with identityHashcodes that only rarely have collisions. Since this is under JVM control and my test case shows this is rare, I'm happy with this. > The important thing at this point is to agree on what invariants we > expect [pool] and user factories to maintain. Since I have already > changed my mind once on this, and I understand the practical > arguments in favor of staying with the setup now in trunk (equal > instances not allowed), I would like to hear what others have to say > on this. I think we can have it both ways this time :) I'll leave this a little while before I do any coding to give folks a chance to comment. Mark - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 6/9/11 9:05 AM, Mark Thomas wrote: > On 09/06/2011 15:48, Gary Gregory wrote: >> Hi All: >> >> I would like to understand the requirements better: >> >> - Is this for pool1 and/or pool2? It seems like a big change for pool1 that >> should be in a 1.6 (not 1.5.x) > pool2. No plans for this change in pool1. > >> - Do we have real user stories for this new req? Or is this a theoretical >> nicety? > The most obvious one is preventing the same object being returned to the > pool more than once. POOL-103. > > Keeping track of all the objects in the pool opens up a range of > possibilities for improved monitoring/management features. We could, for > example, move the abandoned object tracking from DBCP to pool. I guess I "want it both ways" here. We already have a use case for tracking instances - abandoned connection tracking in DBCP and *lots* of requests and practical uses for holding references to instances in circulation (mostly from dbcp, but I recall some from people managing socket pools or message queue pools). I know I flip/flopped on this - initially thinking it was a reasonable expectation for factories to produce equals-discernible instances. But I am afraid it will be the source of hard to find bugs and I can see the argument on the other side - i.e., a pool should be perfectly happy managing indiscernible objects. I never agreed with Leibniz [1] - he he. As you pointed out, Mark, 1.x gets around this problem by wrapping / unwrapping but that strategy makes identity-checking awkward. I will think about this some more, but a hybrid strategy where _allObjects becomes a HashBag (steal from [collections]) used for quick checking and we add _allInstances to hold wrapped instances. Could be nonsense, but something along these lines might work. The important thing at this point is to agree on what invariants we expect [pool] and user factories to maintain. Since I have already changed my mind once on this, and I understand the practical arguments in favor of staying with the setup now in trunk (equal instances not allowed), I would like to hear what others have to say on this. Phil [1] http://en.wikipedia.org/wiki/Identity_of_indiscernibles > Mark > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 09/06/2011 15:48, Gary Gregory wrote: > Hi All: > > I would like to understand the requirements better: > > - Is this for pool1 and/or pool2? It seems like a big change for pool1 that > should be in a 1.6 (not 1.5.x) pool2. No plans for this change in pool1. > - Do we have real user stories for this new req? Or is this a theoretical > nicety? The most obvious one is preventing the same object being returned to the pool more than once. POOL-103. Keeping track of all the objects in the pool opens up a range of possibilities for improved monitoring/management features. We could, for example, move the abandoned object tracking from DBCP to pool. Mark - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 09/06/2011 14:50, sebb wrote: > On 9 June 2011 14:41, Mark Thomas wrote: >> On 09/06/2011 10:01, Julien Aymé wrote: >>> 2011/6/9 Mark Thomas : On 09/06/2011 04:39, Phil Steitz wrote: > Code in trunk now does not work when distinct pooled instances are > equal - i.e., if a factory produces instances A and B and > A.equals(B), this causes problems. I think this situation should > be allowed - i.e. it is an unacceptable restriction to put on object > factories that distinct the poolable objects they produce be > distinguishable under equals. This would be a new requirement for > [pool] and I don't think we should require it. What do others think? As I start to answer this, I can see a very long response developing. I will do my best to keep it short. That may mean I gloss over some aspects. The requirement that objects obtained from the factories meet A.equals(B) == false greatly simplifies the implementation of a number of requirements. Let me explain by using a single requirement although there are a number of other requirements that have very similar consequences. The Requirement: It shall not be possible to return an object to the pool more than once. The pool maintains a list of idle objects. The simplest implementation of the above requirement is to test if any returned object already exists in the pool. This doesn't catch all scenarios but it is a start. If we know that for objects obtained from the factories A.equals(B) == false then we can use a HashSet to store idle instances and it is very easy to determine if the object being returned exists in the set of idle objects. This makes determining if the object is being returned twice relatively inexpensive. It also makes a reasonable multi-threaded implementation possible. >>> >>> >>> >>> And what about using an IdentityHashSet (or IdentityHashMap) to store >>> idle objects. >>> This would meet the Requirement without having to enforce A.equals(B) == >>> false. >> >> That would be one of the aspects I glossed over. They aren't always >> maps/sets and they need to support concurrent access by multiple threads. >> >> A wrapper for pooled objects that uses System.identityHashCode(Object) >> may be a possible solution that isn't too complex. It would add a >> requirement for the pool to unwrap/wrap objects on borrow/return. I can >> look at this if folks think the new restriction on factories is >> unacceptable. > > Note that System.identityHashCode() is not necessarily unique: > https://issues.apache.org/jira/browse/LANG-459 I think I might be able to work around that by asking the factory for another object if it issues one with an identityHashCode we have already seen. > Could of course use == to disambiguate such objects. That would have some nasty performance implications and may also needs some syncs in a few places. Mark - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
Hi All: I would like to understand the requirements better: - Is this for pool1 and/or pool2? It seems like a big change for pool1 that should be in a 1.6 (not 1.5.x) - Do we have real user stories for this new req? Or is this a theoretical nicety? Thank you, Gary On Thu, Jun 9, 2011 at 9:50 AM, sebb wrote: > On 9 June 2011 14:41, Mark Thomas wrote: > > On 09/06/2011 10:01, Julien Aymé wrote: > >> 2011/6/9 Mark Thomas : > >>> On 09/06/2011 04:39, Phil Steitz wrote: > Code in trunk now does not work when distinct pooled instances are > equal - i.e., if a factory produces instances A and B and > A.equals(B), this causes problems. I think this situation should > be allowed - i.e. it is an unacceptable restriction to put on object > factories that distinct the poolable objects they produce be > distinguishable under equals. This would be a new requirement for > [pool] and I don't think we should require it. What do others think? > >>> > >>> As I start to answer this, I can see a very long response developing. I > >>> will do my best to keep it short. That may mean I gloss over some > aspects. > >>> > >>> The requirement that objects obtained from the factories meet > >>> A.equals(B) == false greatly simplifies the implementation of a number > >>> of requirements. Let me explain by using a single requirement although > >>> there are a number of other requirements that have very similar > >>> consequences. > >>> > >>> The Requirement: > >>> It shall not be possible to return an object to the pool more than > once. > >>> > >>> The pool maintains a list of idle objects. The simplest implementation > >>> of the above requirement is to test if any returned object already > >>> exists in the pool. This doesn't catch all scenarios but it is a start. > >>> > >>> If we know that for objects obtained from the factories A.equals(B) == > >>> false then we can use a HashSet to store idle instances and it is very > >>> easy to determine if the object being returned exists in the set of > idle > >>> objects. This makes determining if the object is being returned twice > >>> relatively inexpensive. It also makes a reasonable multi-threaded > >>> implementation possible. > >> > >> > >> > >> And what about using an IdentityHashSet (or IdentityHashMap) to store > >> idle objects. > >> This would meet the Requirement without having to enforce A.equals(B) == > false. > > > > That would be one of the aspects I glossed over. They aren't always > > maps/sets and they need to support concurrent access by multiple threads. > > > > A wrapper for pooled objects that uses System.identityHashCode(Object) > > may be a possible solution that isn't too complex. It would add a > > requirement for the pool to unwrap/wrap objects on borrow/return. I can > > look at this if folks think the new restriction on factories is > > unacceptable. > > Note that System.identityHashCode() is not necessarily unique: > https://issues.apache.org/jira/browse/LANG-459 > > Could of course use == to disambiguate such objects. > > > Mark > > > > - > > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > > For additional commands, e-mail: dev-h...@commons.apache.org > > > > > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > -- Thank you, Gary http://garygregory.wordpress.com/ http://garygregory.com/ http://people.apache.org/~ggregory/ http://twitter.com/GaryGregory
Re: [pool] equal instances
On 9 June 2011 14:41, Mark Thomas wrote: > On 09/06/2011 10:01, Julien Aymé wrote: >> 2011/6/9 Mark Thomas : >>> On 09/06/2011 04:39, Phil Steitz wrote: Code in trunk now does not work when distinct pooled instances are equal - i.e., if a factory produces instances A and B and A.equals(B), this causes problems. I think this situation should be allowed - i.e. it is an unacceptable restriction to put on object factories that distinct the poolable objects they produce be distinguishable under equals. This would be a new requirement for [pool] and I don't think we should require it. What do others think? >>> >>> As I start to answer this, I can see a very long response developing. I >>> will do my best to keep it short. That may mean I gloss over some aspects. >>> >>> The requirement that objects obtained from the factories meet >>> A.equals(B) == false greatly simplifies the implementation of a number >>> of requirements. Let me explain by using a single requirement although >>> there are a number of other requirements that have very similar >>> consequences. >>> >>> The Requirement: >>> It shall not be possible to return an object to the pool more than once. >>> >>> The pool maintains a list of idle objects. The simplest implementation >>> of the above requirement is to test if any returned object already >>> exists in the pool. This doesn't catch all scenarios but it is a start. >>> >>> If we know that for objects obtained from the factories A.equals(B) == >>> false then we can use a HashSet to store idle instances and it is very >>> easy to determine if the object being returned exists in the set of idle >>> objects. This makes determining if the object is being returned twice >>> relatively inexpensive. It also makes a reasonable multi-threaded >>> implementation possible. >> >> >> >> And what about using an IdentityHashSet (or IdentityHashMap) to store >> idle objects. >> This would meet the Requirement without having to enforce A.equals(B) == >> false. > > That would be one of the aspects I glossed over. They aren't always > maps/sets and they need to support concurrent access by multiple threads. > > A wrapper for pooled objects that uses System.identityHashCode(Object) > may be a possible solution that isn't too complex. It would add a > requirement for the pool to unwrap/wrap objects on borrow/return. I can > look at this if folks think the new restriction on factories is > unacceptable. Note that System.identityHashCode() is not necessarily unique: https://issues.apache.org/jira/browse/LANG-459 Could of course use == to disambiguate such objects. > Mark > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 09/06/2011 10:01, Julien Aymé wrote: > 2011/6/9 Mark Thomas : >> On 09/06/2011 04:39, Phil Steitz wrote: >>> Code in trunk now does not work when distinct pooled instances are >>> equal - i.e., if a factory produces instances A and B and >>> A.equals(B), this causes problems. I think this situation should >>> be allowed - i.e. it is an unacceptable restriction to put on object >>> factories that distinct the poolable objects they produce be >>> distinguishable under equals. This would be a new requirement for >>> [pool] and I don't think we should require it. What do others think? >> >> As I start to answer this, I can see a very long response developing. I >> will do my best to keep it short. That may mean I gloss over some aspects. >> >> The requirement that objects obtained from the factories meet >> A.equals(B) == false greatly simplifies the implementation of a number >> of requirements. Let me explain by using a single requirement although >> there are a number of other requirements that have very similar >> consequences. >> >> The Requirement: >> It shall not be possible to return an object to the pool more than once. >> >> The pool maintains a list of idle objects. The simplest implementation >> of the above requirement is to test if any returned object already >> exists in the pool. This doesn't catch all scenarios but it is a start. >> >> If we know that for objects obtained from the factories A.equals(B) == >> false then we can use a HashSet to store idle instances and it is very >> easy to determine if the object being returned exists in the set of idle >> objects. This makes determining if the object is being returned twice >> relatively inexpensive. It also makes a reasonable multi-threaded >> implementation possible. > > > > And what about using an IdentityHashSet (or IdentityHashMap) to store > idle objects. > This would meet the Requirement without having to enforce A.equals(B) == > false. That would be one of the aspects I glossed over. They aren't always maps/sets and they need to support concurrent access by multiple threads. A wrapper for pooled objects that uses System.identityHashCode(Object) may be a possible solution that isn't too complex. It would add a requirement for the pool to unwrap/wrap objects on borrow/return. I can look at this if folks think the new restriction on factories is unacceptable. Mark - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
2011/6/9 Mark Thomas : > On 09/06/2011 04:39, Phil Steitz wrote: >> Code in trunk now does not work when distinct pooled instances are >> equal - i.e., if a factory produces instances A and B and >> A.equals(B), this causes problems. I think this situation should >> be allowed - i.e. it is an unacceptable restriction to put on object >> factories that distinct the poolable objects they produce be >> distinguishable under equals. This would be a new requirement for >> [pool] and I don't think we should require it. What do others think? > > As I start to answer this, I can see a very long response developing. I > will do my best to keep it short. That may mean I gloss over some aspects. > > The requirement that objects obtained from the factories meet > A.equals(B) == false greatly simplifies the implementation of a number > of requirements. Let me explain by using a single requirement although > there are a number of other requirements that have very similar > consequences. > > The Requirement: > It shall not be possible to return an object to the pool more than once. > > The pool maintains a list of idle objects. The simplest implementation > of the above requirement is to test if any returned object already > exists in the pool. This doesn't catch all scenarios but it is a start. > > If we know that for objects obtained from the factories A.equals(B) == > false then we can use a HashSet to store idle instances and it is very > easy to determine if the object being returned exists in the set of idle > objects. This makes determining if the object is being returned twice > relatively inexpensive. It also makes a reasonable multi-threaded > implementation possible. And what about using an IdentityHashSet (or IdentityHashMap) to store idle objects. This would meet the Requirement without having to enforce A.equals(B) == false. WDYT? Julien > > If we have to rely on testing A!=B then we have no choice but to iterate > through the idle objects. This is slow and would probably require at > least a small sync. In the multi-threaded case this is going to be > really slow. > > An alternative solution to handle factories where A.equals(B) == true > would be to wrap the object in another object where A.equals(B) == > false. This would make usage of the pool by clients more awkward as they > would need to unwrap the object. > > We can't add an intermediate layer that maps objects to wrappers since > that layer would need to maintain a mapping of objects to wrappers and > the only efficient way to do that is with a HashMap which requires > A.equals(B) == false. > > For the sort of functionality we want to have in pool2, I believe we > need to be able to use HashMap and derivatives where the objects are > used as keys. While implementations of some of the features may be > possible without this, I don't see a way to implement them without code > that is significantly more complex than pool1. One of the big advantages > of the pool2 code is that it is - in my view - much easier to understand > what is going on. I suggest that folks look at the current pool2 > implementation and try and figure out how to replace all cases where > pooled objects are used as keys in HashMaps etc. I think it would be > very difficult / impossible. > > Mark > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 09/06/2011 04:39, Phil Steitz wrote: > Code in trunk now does not work when distinct pooled instances are > equal - i.e., if a factory produces instances A and B and > A.equals(B), this causes problems. I think this situation should > be allowed - i.e. it is an unacceptable restriction to put on object > factories that distinct the poolable objects they produce be > distinguishable under equals. This would be a new requirement for > [pool] and I don't think we should require it. What do others think? As I start to answer this, I can see a very long response developing. I will do my best to keep it short. That may mean I gloss over some aspects. The requirement that objects obtained from the factories meet A.equals(B) == false greatly simplifies the implementation of a number of requirements. Let me explain by using a single requirement although there are a number of other requirements that have very similar consequences. The Requirement: It shall not be possible to return an object to the pool more than once. The pool maintains a list of idle objects. The simplest implementation of the above requirement is to test if any returned object already exists in the pool. This doesn't catch all scenarios but it is a start. If we know that for objects obtained from the factories A.equals(B) == false then we can use a HashSet to store idle instances and it is very easy to determine if the object being returned exists in the set of idle objects. This makes determining if the object is being returned twice relatively inexpensive. It also makes a reasonable multi-threaded implementation possible. If we have to rely on testing A!=B then we have no choice but to iterate through the idle objects. This is slow and would probably require at least a small sync. In the multi-threaded case this is going to be really slow. An alternative solution to handle factories where A.equals(B) == true would be to wrap the object in another object where A.equals(B) == false. This would make usage of the pool by clients more awkward as they would need to unwrap the object. We can't add an intermediate layer that maps objects to wrappers since that layer would need to maintain a mapping of objects to wrappers and the only efficient way to do that is with a HashMap which requires A.equals(B) == false. For the sort of functionality we want to have in pool2, I believe we need to be able to use HashMap and derivatives where the objects are used as keys. While implementations of some of the features may be possible without this, I don't see a way to implement them without code that is significantly more complex than pool1. One of the big advantages of the pool2 code is that it is - in my view - much easier to understand what is going on. I suggest that folks look at the current pool2 implementation and try and figure out how to replace all cases where pooled objects are used as keys in HashMaps etc. I think it would be very difficult / impossible. Mark - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
Re: [pool] equal instances
On 9 June 2011 04:39, Phil Steitz wrote: > Code in trunk now does not work when distinct pooled instances are > equal - i.e., if a factory produces instances A and B and > A.equals(B), this causes problems. I think this situation should > be allowed - i.e. it is an unacceptable restriction to put on object > factories that distinct the poolable objects they produce be > distinguishable under equals. This would be a new requirement for > [pool] and I don't think we should require it. What do others think? I agree one should not require instances to be unequal. That means Pool has to be a bit careful about how it uses hashes - presumably these objects will have the same hashCodes too. > Phil > > - > To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org > For additional commands, e-mail: dev-h...@commons.apache.org > > - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org
[pool] equal instances
Code in trunk now does not work when distinct pooled instances are equal - i.e., if a factory produces instances A and B and A.equals(B), this causes problems. I think this situation should be allowed - i.e. it is an unacceptable restriction to put on object factories that distinct the poolable objects they produce be distinguishable under equals. This would be a new requirement for [pool] and I don't think we should require it. What do others think? Phil - To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org For additional commands, e-mail: dev-h...@commons.apache.org