Re: [infinispan-dev] default value for virtualNodes

2012-01-30 Thread Dan Berindei
Manik, I'm assigning ISPN-1801 to myself - I need to add my key
distribution test and the results anyway.

Cheers
Dan


On Mon, Jan 30, 2012 at 1:17 PM, Manik Surtani ma...@jboss.org wrote:

 On 29 Jan 2012, at 14:57, Galder Zamarreño wrote:

 To reiterate what I said in another thread, virtual nodes have had no memory
 impact on Hot Rod clients since version 1.1 of the protocol (included in
 5.1).

 I enhanced the protocol so that clients would generate virtual node hashes
 and so avoid sending them over the wire.


 Yes, I remember.  Perfect.  :)

 So for https://issues.jboss.org/browse/ISPN-1801 I'll set the default to 48.
  Any objections?

 Cheers
 Manik

 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani

 Lead, Infinispan
 http://www.infinispan.org




 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev



Re: [infinispan-dev] default value for virtualNodes

2012-01-29 Thread Galder Zamarreño
To reiterate what I said in another thread, virtual nodes have had no memory 
impact on Hot Rod clients since version 1.1 of the protocol (included in 
5.1). 

I enhanced the protocol so that clients would generate virtual node hashes and 
so avoid sending them over the wire.

Cheers,

On Jan 27, 2012, at 4:19 PM, Manik Surtani wrote:

 Good stuff!  Thanks for this.  Yes, I'm ok with numVirtualNodes=48 as a 
 default.  Galder, your thoughts from a Hot Rod perspective?
 
 On 27 Jan 2012, at 08:41, Dan Berindei wrote:
 
 Hi guys
 
 I've been working on a test to search for an optimal default value here:
 https://github.com/danberindei/infinispan/commit/983c0328dc40be9609fcabb767dd46f9b98af464
 
 I'm measuring both the number of keys for which a node is primary
 owner and the number of keys for which it is one of the owners
 compared to the ideal distribution (K/N keys on each node). The former
 tells us how much more work the node could be expected to do, the
 latter how much memory the node is likely to need.
 
 I'm only running 1 loops, so the max figure is not the absolute
 maximum. But it's certainly bigger than the 0. percentile.
 
 The full results are here:
 http://fpaste.org/cI1r/
 
 The uniformity of the distribution goes up with the number of virtual
 nodes but down with the number of physical nodes. I think we should go
 with a default of 48 nodes (or 50 if you prefer decimal). With 32
 nodes, there's only a 0.1% chance that a node will hold more than 1.35
 * K/N keys, and a 0.1% chance that the node will be primary owner for
 more than 1.5 * K/N keys.
 
 We could go higher, but we run against the risk of node addresses
 colliding on the hash wheel. According to the formula on the Birthday
 Paradox page (http://en.wikipedia.org/wiki/Birthday_problem), we only
 need 2072 addresses on our 2^31 hash wheel to get a 0.1% chance of
 collision. That means 21 nodes * 96 virtual nodes, 32 nodes * 64
 virtual nodes or 43 nodes * 48 virtual nodes.
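
The collision figures above can be checked with a minimal sketch, assuming the standard birthday-problem approximation p ~= 1 - exp(-n^2 / (2d)) from the linked page. The function names are illustrative only and are not Infinispan code.

```python
import math

WHEEL_SIZE = 2 ** 31  # size of the hash wheel, as stated in the message


def addresses_for_collision_probability(p, wheel_size):
    """Approximate number of random addresses that gives collision probability p.

    Solves p ~= 1 - exp(-n^2 / (2 * d)) for n.
    """
    return math.sqrt(2 * wheel_size * math.log(1 / (1 - p)))


def collision_probability(n, wheel_size):
    """Approximate probability that at least two of n random addresses collide."""
    return 1 - math.exp(-n * (n - 1) / (2 * wheel_size))


limit = addresses_for_collision_probability(0.001, WHEEL_SIZE)
print(f"addresses for a 0.1% collision chance: about {limit:.1f}")

# Cluster shapes quoted in the message: physical nodes * virtual nodes.
for nodes, vnodes in [(21, 96), (32, 64), (43, 48)]:
    total = nodes * vnodes
    print(nodes, vnodes, total, f"{collision_probability(total, WHEEL_SIZE):.6f}")
```

Under this approximation the limit comes out just under 2073 addresses, consistent with the 2072 quoted, and each of the three cluster shapes stays below it.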
 
 Cheers
 Dan
 
 
 On Fri, Jan 27, 2012 at 12:37 AM, Sanne Grinovero sa...@infinispan.org 
 wrote:
 On 26 January 2012 22:29, Manik Surtani ma...@jboss.org wrote:
 
 On 26 Jan 2012, at 20:16, Sanne Grinovero wrote:
 
 +1
 Which default? 100? A prime?
 
 We should also make sure the CH function is optimized for this being on.
 
 
 Yes, we should profile a session with vnodes enabled.
 
 Manik, we're using VNodes in our performance tests. The proposal is to
 provide a good default value, as the feature is currently
 disabled by default.
 
 Cheers,
 Sanne
 
 
 
 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani
 
 Lead, Infinispan
 http://www.infinispan.org
 
 
 
 
 
 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani
 
 Lead, Infinispan
 http://www.infinispan.org
 
 
 
 

--
Galder Zamarreño
Sr. Software Engineer
Infinispan, JBoss Cache




Re: [infinispan-dev] default value for virtualNodes

2012-01-27 Thread Manik Surtani
Good stuff!  Thanks for this.  Yes, I'm ok with numVirtualNodes=48 as a 
default.  Galder, your thoughts from a Hot Rod perspective?

On 27 Jan 2012, at 08:41, Dan Berindei wrote:

 Hi guys
 
 I've been working on a test to search for an optimal default value here:
 https://github.com/danberindei/infinispan/commit/983c0328dc40be9609fcabb767dd46f9b98af464
 
 I'm measuring both the number of keys for which a node is primary
 owner and the number of keys for which it is one of the owners
 compared to the ideal distribution (K/N keys on each node). The former
 tells us how much more work the node could be expected to do, the
 latter how much memory the node is likely to need.
 
 I'm only running 1 loops, so the max figure is not the absolute
 maximum. But it's certainly bigger than the 0. percentile.
 
 The full results are here:
 http://fpaste.org/cI1r/
 
 The uniformity of the distribution goes up with the number of virtual
 nodes but down with the number of physical nodes. I think we should go
 with a default of 48 nodes (or 50 if you prefer decimal). With 32
 nodes, there's only a 0.1% chance that a node will hold more than 1.35
 * K/N keys, and a 0.1% chance that the node will be primary owner for
 more than 1.5 * K/N keys.
 
 We could go higher, but we run against the risk of node addresses
 colliding on the hash wheel. According to the formula on the Birthday
 Paradox page (http://en.wikipedia.org/wiki/Birthday_problem), we only
 need 2072 addresses on our 2^31 hash wheel to get a 0.1% chance of
 collision. That means 21 nodes * 96 virtual nodes, 32 nodes * 64
 virtual nodes or 43 nodes * 48 virtual nodes.
 
 Cheers
 Dan
 
 
 On Fri, Jan 27, 2012 at 12:37 AM, Sanne Grinovero sa...@infinispan.org 
 wrote:
 On 26 January 2012 22:29, Manik Surtani ma...@jboss.org wrote:
 
 On 26 Jan 2012, at 20:16, Sanne Grinovero wrote:
 
 +1
 Which default? 100? A prime?
 
 We should also make sure the CH function is optimized for this being on.
 
 
 Yes, we should profile a session with vnodes enabled.
 
 Manik, we're using VNodes in our performance tests. The proposal is to
 provide a good default value, as the feature is currently
 disabled by default.
 
 Cheers,
 Sanne
 
 
 
 --
 Manik Surtani
 ma...@jboss.org
 twitter.com/maniksurtani
 
 Lead, Infinispan
 http://www.infinispan.org
 
 
 
 

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org






Re: [infinispan-dev] default value for virtualNodes

2012-01-27 Thread Bela Ban
I assume the number of vnodes cannot be changed at runtime, dynamically 
adapting to a changing environment ?

I understand everybody has to have the exact same number of vnodes for 
reads and writes to hit the correct node, right ?

On 1/27/12 9:41 AM, Dan Berindei wrote:
 Hi guys

 I've been working on a test to search for an optimal default value here:
 https://github.com/danberindei/infinispan/commit/983c0328dc40be9609fcabb767dd46f9b98af464

 I'm measuring both the number of keys for which a node is primary
 owner and the number of keys for which it is one of the owners
 compared to the ideal distribution (K/N keys on each node). The former
 tells us how much more work the node could be expected to do, the
 latter how much memory the node is likely to need.

 I'm only running 1 loops, so the max figure is not the absolute
 maximum. But it's certainly bigger than the 0. percentile.

 The full results are here:
 http://fpaste.org/cI1r/

 The uniformity of the distribution goes up with the number of virtual
 nodes but down with the number of physical nodes. I think we should go
 with a default of 48 nodes (or 50 if you prefer decimal). With 32
 nodes, there's only a 0.1% chance that a node will hold more than 1.35
 * K/N keys, and a 0.1% chance that the node will be primary owner for
 more than 1.5 * K/N keys.

 We could go higher, but we run against the risk of node addresses
 colliding on the hash wheel. According to the formula on the Birthday
 Paradox page (http://en.wikipedia.org/wiki/Birthday_problem), we only
 need 2072 addresses on our 2^31 hash wheel to get a 0.1% chance of
 collision. That means 21 nodes * 96 virtual nodes, 32 nodes * 64
 virtual nodes or 43 nodes * 48 virtual nodes.

 Cheers
 Dan


 On Fri, Jan 27, 2012 at 12:37 AM, Sanne Grinovero sa...@infinispan.org
 wrote:
 On 26 January 2012 22:29, Manik Surtani ma...@jboss.org wrote:

 On 26 Jan 2012, at 20:16, Sanne Grinovero wrote:

 +1
 Which default? 100? A prime?

 We should also make sure the CH function is optimized for this being on.


 Yes, we should profile a session with vnodes enabled.

 Manik, we're using VNodes in our performance tests. The proposal is to
 provide a good default value, as the feature is currently
 disabled by default.


-- 
Bela Ban
Lead JGroups (http://www.jgroups.org)
JBoss / Red Hat


Re: [infinispan-dev] default value for virtualNodes

2012-01-27 Thread Manik Surtani
On 27 Jan 2012, at 10:52, Bela Ban wrote:

 I assume the number of vnodes cannot be changed at runtime, dynamically 
 adapting to a changing environment ?
 
 I understand everybody has to have the exact same number of vnodes for 
 reads and writes to hit the correct node, right ?

Yes.
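
A minimal sketch of why every member needs the same setting, assuming a simplified hash wheel in which each node is placed at positions derived from hashing a "node#index" label. The hash function, the labels and the helper names are assumptions made for illustration and do not reflect Infinispan's actual consistent hash.

```python
import bisect
import hashlib

WHEEL_SIZE = 2 ** 31


def wheel_position(label):
    """Map a label to a position on the wheel (illustrative hash, not Infinispan's)."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % WHEEL_SIZE


def build_wheel(nodes, num_virtual_nodes):
    """Return sorted wheel positions and the node owning each position."""
    points = sorted(
        (wheel_position(f"{node}#{i}"), node)
        for node in nodes
        for i in range(num_virtual_nodes)
    )
    return [p for p, _ in points], [n for _, n in points]


def owner(wheel, key):
    """Primary owner: first wheel position at or after the key, wrapping around."""
    positions, owners = wheel
    idx = bisect.bisect_left(positions, wheel_position(key)) % len(positions)
    return owners[idx]


def count_disagreements(nodes, vnodes_a, vnodes_b, num_keys=10_000):
    """Count keys that two members would route to different owners."""
    wheel_a = build_wheel(nodes, vnodes_a)
    wheel_b = build_wheel(nodes, vnodes_b)
    return sum(
        owner(wheel_a, f"key-{k}") != owner(wheel_b, f"key-{k}")
        for k in range(num_keys)
    )


nodes = ["A", "B", "C", "D"]
print("same setting (48 vs 48):", count_disagreements(nodes, 48, 48))
print("different setting (48 vs 64):", count_disagreements(nodes, 48, 64))
```

With identical settings both members compute the same wheel, so no key is routed differently; once the counts differ, some keys resolve to different owners and reads can miss the node that holds the data.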

--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org





Re: [infinispan-dev] default value for virtualNodes

2012-01-27 Thread Mircea Markus
I've created a JIRA to track this: https://issues.jboss.org/browse/ISPN-1801

 I understand everybody has to have the exact same number of vnodes for 
 reads and writes to hit the correct node, right ?
 Yes.

That's true, but it is not a good thing: numVirtNodes should be proportional 
to the node's capacity, i.e. more powerful machines in the cluster should 
be assigned more virtual nodes.
This way we can better control the load. A node would need to send its 
configured numVirtualNodes when joining in order to support this, but that's a 
thing we already do for TACH.
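
The capacity idea can be sketched as follows, assuming each node advertises an integer capacity factor and receives that multiple of a base number of virtual nodes. The capacity factors, the hash function and all names here are hypothetical; this is not an existing Infinispan feature or API.

```python
import bisect
import hashlib

WHEEL_SIZE = 2 ** 31


def wheel_position(label):
    """Map a label to a position on the wheel (illustrative hash only)."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % WHEEL_SIZE


def build_weighted_wheel(capacities, base_virtual_nodes):
    """Give each node base_virtual_nodes * capacity positions on the wheel."""
    points = sorted(
        (wheel_position(f"{node}#{i}"), node)
        for node, capacity in capacities.items()
        for i in range(base_virtual_nodes * capacity)
    )
    return [p for p, _ in points], [n for _, n in points]


def key_shares(capacities, base_virtual_nodes, num_keys=10_000):
    """Fraction of keys for which each node is primary owner."""
    positions, owners = build_weighted_wheel(capacities, base_virtual_nodes)
    counts = {node: 0 for node in capacities}
    for k in range(num_keys):
        idx = bisect.bisect_left(positions, wheel_position(f"key-{k}"))
        counts[owners[idx % len(owners)]] += 1
    return {node: count / num_keys for node, count in counts.items()}


# Hypothetical cluster: one machine with four times the capacity of the others.
capacities = {"big": 4, "small-1": 1, "small-2": 1}
for node, share in key_shares(capacities, 48).items():
    print(f"{node}: {share:.3f} of keys")
```

In this sketch the larger machine ends up owning a share of keys roughly in line with its share of virtual nodes, which is the load-control effect described above. Every member still has to know every other member's count to build the same wheel.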

Re: [infinispan-dev] default value for virtualNodes

2012-01-27 Thread Dan Berindei
On Fri, Jan 27, 2012 at 2:35 PM, Mircea Markus mircea.mar...@jboss.com wrote:
 I've created a JIRA to track this: https://issues.jboss.org/browse/ISPN-1801

 I understand everybody has to have the exact same number of vnodes for
 reads and writes to hit the correct node, right ?

 Yes.

 That's true, but it is not a good thing: numVirtNodes should be proportional
 to the node's capacity, i.e. more powerful machines in the cluster should
 be assigned more virtual nodes.
 This way we can better control the load. A node would need to send its
 configured numVirtualNodes when joining in order to support this, but that's
 a thing we already do for  TACH.


We should use a different mechanism than the TopologyAwareUUID we use
for TACH, because the address is sent with every command. The capacity
instead should be fairly static. We may want to make it changeable at
runtime, but it will take a state transfer to propagate that info to
all the members of the cluster (because the nodes' CHs need to stay in
sync).

In fact, I can imagine users wanting to balance key ownership between
machines/racks/sites with TACH, but without actually using RELAY - the
TopologyAwareUUID is just an overhead for them.

Cheers
Dan



Re: [infinispan-dev] default value for virtualNodes

2012-01-27 Thread Dan Berindei
On Fri, Jan 27, 2012 at 2:53 PM, Mircea Markus mircea.mar...@jboss.com wrote:
 That's true, but it is not a good thing: numVirtNodes should be proportional
 to the node's capacity, i.e. more powerful machines in the cluster should
 be assigned more virtual nodes.
 This way we can better control the load. A node would need to send its
 configured numVirtualNodes when joining in order to support this, but that's
 a thing we already do for  TACH.


 We should use a different mechanism than the TopologyAwareUUID we use
 for TACH, because the address is sent with every command.
 So every command sends cluster, rack and machine info? That sounds a bit
 redundant. Can't we just send them once with the JOIN request?

When RELAY is enabled, it actually needs the topology info in order to
relay messages between sites.
I agree that the topology info should never change, but RELAY requires
it for now so we can't avoid it (in the general case).

Cheers
Dan



Re: [infinispan-dev] default value for virtualNodes

2012-01-26 Thread Sanne Grinovero
+1
Which default? 100? A prime?

We should also make sure the CH function is optimized for this being on.
 On Jan 26, 2012 8:12 PM, Pete Muir pm...@redhat.com wrote:

 I think if we are confident it will benefit all, we should turn it on.

 On 26 Jan 2012, at 18:54, Mircea Markus wrote:

  Hi,
 
  ATM the default value for virtualNodes is 1. This means that the
 wheel-share each node has can be very uneven[1] for small (up to 15 nodes)
 clusters.
  Increasing this value even to a small number (10-30) would significantly
 improve each node's share of the wheel and the chance of a well balanced data
 distribution over the cluster.
  So I think that increasing the default value would make sense. What are
 the drawbacks though? I'm thinking performance and HR wise...
 
 
  [1] a random example of uneven distribution obtained with radargun
 
  Cluster size: 4 - ( 15505 13698 5918 4482)
  Cluster size: 6 - ( 8761 7820 17145 8188 12827 4183)
  Cluster size: 8 - ( 8391 6302 10773 22068 3589 200 3050 25211)
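
The effect described above can be explored with a minimal simulation, assuming a simplified hash wheel where each node is placed at positions derived from hashing a "node#index" label. The hash function and helper names are assumptions for illustration, so the numbers it prints will not match the radargun figures.

```python
import bisect
import hashlib

WHEEL_SIZE = 2 ** 31


def wheel_position(label):
    """Map a label to a position on the wheel (illustrative hash only)."""
    digest = hashlib.sha256(label.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % WHEEL_SIZE


def keys_per_node(nodes, num_virtual_nodes, num_keys):
    """Count the keys for which each node is primary owner."""
    points = sorted(
        (wheel_position(f"{node}#{i}"), node)
        for node in nodes
        for i in range(num_virtual_nodes)
    )
    positions = [p for p, _ in points]
    owners = [n for _, n in points]
    counts = {node: 0 for node in nodes}
    for k in range(num_keys):
        # First wheel position at or after the key, wrapping past the end.
        idx = bisect.bisect_left(positions, wheel_position(f"key-{k}"))
        counts[owners[idx % len(owners)]] += 1
    return counts


def worst_ratio(nodes, num_virtual_nodes, num_keys=40_000):
    """Largest per-node key count divided by the ideal K/N."""
    counts = keys_per_node(nodes, num_virtual_nodes, num_keys)
    return max(counts.values()) / (num_keys / len(nodes))


for cluster_size in (4, 6, 8):
    nodes = [f"node-{i}" for i in range(cluster_size)]
    for vnodes in (1, 10, 30):
        print(cluster_size, vnodes, f"{worst_ratio(nodes, vnodes):.2f}")
```

A ratio close to 1.0 means the busiest node holds about K/N keys; the further above 1.0, the more uneven the distribution.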
 
 
 




Re: [infinispan-dev] default value for virtualNodes

2012-01-26 Thread Manik Surtani

On 26 Jan 2012, at 20:16, Sanne Grinovero wrote:

 +1
 Which default? 100? A prime?
 
 We should also make sure the CH function is optimized for this being on.
 

Yes, we should profile a session with vnodes enabled.


--
Manik Surtani
ma...@jboss.org
twitter.com/maniksurtani

Lead, Infinispan
http://www.infinispan.org


