Re: [infinispan-dev] Separate ExecutorService for map/reduce tasks?

2012-12-05 Thread Tristan Tarrant
In 6.0 I would really like to move away from the current executor 
configuration (i.e. a specific element for every executor) and allow the 
creation of named executors (this is how the AS configuration works).

Tristan

On 11/27/2012 09:07 PM, Vladimir Blagojevic wrote:
 Hi,

 Although https://issues.jboss.org/browse/ISPN-2284 is scheduled for 6.0, I
 would like to see if it is possible to finish it for 5.2. I have already
 done most of the parallel execution work this week and last [1].
 However, this change is not limited to the map/reduce package, as we
 might want a separate executor for map/reduce execution on each node,
 and that affects the global configuration. Or should we simply use the
 transport executor to run these tasks for now and, should the need arise,
 introduce a separate executor in a future release?

 Regards,
 Vladimir


 [1] https://github.com/vblagoje/infinispan/tree/t_2284


 ___
 infinispan-dev mailing list
 infinispan-dev@lists.jboss.org
 https://lists.jboss.org/mailman/listinfo/infinispan-dev



___
infinispan-dev mailing list
infinispan-dev@lists.jboss.org
https://lists.jboss.org/mailman/listinfo/infinispan-dev


Re: [infinispan-dev] Separate ExecutorService for map/reduce tasks?

2012-12-05 Thread Mircea Markus

On 5 Dec 2012, at 08:36, Tristan Tarrant wrote:

 In 6.0 I would really like to go away from the current executor 
 configuration (e.g. a specific element for every executor) and allow the 
 creation of named executors (this is how the AS configuration works).

So that you can refer to the same executor from multiple places?

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)





Re: [infinispan-dev] Put issues with newly joining node

2012-12-05 Thread Galder Zamarreño

On Dec 4, 2012, at 10:22 AM, Sanne Grinovero sa...@infinispan.org wrote:

 On 4 December 2012 09:14, Galder Zamarreño gal...@redhat.com wrote:
 Hey Dan/Adrian,
 
 Re: https://issues.jboss.org/browse/ISPN-2541
 
 I'm looking at this intermittent failure, and it seems to be caused by the
 fact that the test does not wait for the cluster to be formed when the new
 node is started, which can lead to a replication timeout failure from the
 newly joining node.
 
 The test can easily be fixed by waiting for the cluster to form and then
 making the call.
 
 [...]
 
 I don't think the cache should ever be in an illegal state to be used
 after being started. So Infinispan should not require tests to wait
 for a cluster to be formed, I'd rather guarantee that after a cache
 is started it's usable.

Precisely, which is why I raised the flag instead of going down the easy path.

 
 If this is not possible, then any application would also need to wait
 for that cluster formed event, and we should expose an API for that.

The problem is considering when a cluster is formed. How many nodes should you 
wait for?

There are already plans for something similar:
https://issues.jboss.org/browse/ISPN-928

 I'd prefer the getCache() to block for long enough.
 
 Sanne
 


--
Galder Zamarreño
gal...@redhat.com
twitter.com/galderz

Project Lead, Escalante
http://escalante.io

Engineer, Infinispan
http://infinispan.org




Re: [infinispan-dev] Put issues with newly joining node

2012-12-05 Thread Galder Zamarreño

On Dec 4, 2012, at 11:52 AM, Bela Ban b...@redhat.com wrote:

 
 
 On 12/4/12 11:30 AM, Dan Berindei wrote:
 BTW, I also got an exception yesterday in MarshallExternalPojosTest and
 I investigated it, but in my case the error was much weirder: two nodes
 both opened a TCP connection to each other, yet neither of them received
 the forwarded command. I've asked Bela to investigate as well, but he
 didn't find anything suspicious in JGroups.
 
 If a node A connects to B and B connects to A at the exact same time
 (and there wasn't any existing connection between the 2 nodes), then one
 of the 2 will 'win' and the other one will close its connection. The
 message to be sent is then lost.
 
 This is corrected by one of the upper layers, e.g. UNICAST retransmits 
 the message until it gets an ack. Re-sending a message will then create 
 a new connection, if the existing one was closed / removed.
 
 However, with UNICAST2, if a given message was the last message and no 
 further messages are sent, then only UNICAST2's stability messages will 
 detect that the other node is missing the last message sent. Stability 
 is triggered every 60 seconds by default, so unless that property was 
 changed, or stability was triggered programmatically, that last (lost) 
 message won't get retransmitted for 60 seconds.

^ Isn't that default too high?

Seems to me the scenario explained could happen relatively easily if two nodes 
are started simultaneously.
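
The retransmit-until-ack behaviour described in the quoted text above can be sketched in plain Java. This is only an illustration of the idea, not JGroups code: `RetransmitSketch`, `sendUntilAcked` and the `Predicate` standing in for "send once and wait one retransmit interval for an ack" are all hypothetical names.

```java
import java.util.function.Predicate;

final class RetransmitSketch {
    /**
     * Sends msg repeatedly until an ack is reported or maxAttempts is reached.
     * Returns the number of attempts used, or -1 if no ack ever arrived.
     */
    static int sendUntilAcked(String msg, Predicate<String> sendAndAwaitAck, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // A lost message (e.g. sent on a connection that was then closed)
            // simply shows up here as "no ack", and the loop sends it again.
            if (sendAndAwaitAck.test(msg)) {
                return attempt;
            }
        }
        return -1;
    }
}
```

With a link that drops only the first send, the message gets through on the second attempt. The UNICAST2 case in the quoted text is the situation where no such loop is running for the last message, so the loss is only noticed when a stability round happens.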

We no longer ask users to stagger their startups?

 
 -- 
 Bela Ban, JGroups lead (http://www.jgroups.org)


--
Galder Zamarreño
gal...@redhat.com
twitter.com/galderz

Project Lead, Escalante
http://escalante.io

Engineer, Infinispan
http://infinispan.org




Re: [infinispan-dev] Put issues with newly joining node

2012-12-05 Thread Bela Ban

On 12/5/12 12:06 PM, Galder Zamarreño wrote:
 On Dec 4, 2012, at 11:52 AM, Bela Ban b...@redhat.com wrote:

 If a node A connects to B and B connects to A at the exact same time
 (and there wasn't any existing connection between the 2 nodes), then one
 of the 2 will 'win' and the other one will close its connection. The
 message to be sent is then lost.

 This is corrected by one of the upper layers, e.g. UNICAST retransmits
 the message until it gets an ack. Re-sending a message will then create
 a new connection, if the existing one was closed / removed.

 However, with UNICAST2, if a given message was the last message and no
 further messages are sent, then only UNICAST2's stability messages will
 detect that the other node is missing the last message sent. Stability
 is triggered every 60 seconds by default, so unless that property was
 changed, or stability was triggered programmatically, that last (lost)
 message won't get retransmitted for 60 seconds.
 ^ Isn't that default too high?

Assuming that the size-based stable messages are the norm, then 
time-based only kicks in as a second line of defense. With 
https://issues.jboss.org/browse/JGRP-1548 in place, this becomes even 
less important, so I want to leave it high as it does generate some 
traffic when set too small.
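
The "size-based first, time-based as a second line of defense" arrangement can be sketched as follows. This is a simplified illustration under my own assumptions, not the UNICAST2 implementation: `StabilityTrigger`, its thresholds and its method names are hypothetical, and time is passed in explicitly so the behaviour is easy to follow.

```java
/** Hypothetical helper, not JGroups code: decides when a stability round should run. */
final class StabilityTrigger {
    private final long maxBytes;       // size-based threshold (the normal trigger)
    private final long maxIntervalMs;  // time-based fallback (second line of defense)
    private long bytesSinceLast = 0;
    private long lastFiredAtMs;

    StabilityTrigger(long maxBytes, long maxIntervalMs, long nowMs) {
        this.maxBytes = maxBytes;
        this.maxIntervalMs = maxIntervalMs;
        this.lastFiredAtMs = nowMs;
    }

    /** Records a received message; returns true if the size threshold was reached. */
    boolean onMessage(int sizeBytes, long nowMs) {
        bytesSinceLast += sizeBytes;
        if (bytesSinceLast >= maxBytes) {
            reset(nowMs);
            return true;
        }
        return false;
    }

    /** Called periodically; returns true only if the fallback interval has elapsed. */
    boolean onTick(long nowMs) {
        if (nowMs - lastFiredAtMs >= maxIntervalMs) {
            reset(nowMs);
            return true;
        }
        return false;
    }

    private void reset(long nowMs) {
        bytesSinceLast = 0;
        lastFiredAtMs = nowMs;
    }
}
```

Under steady traffic the size threshold keeps firing and resetting the timer, so the interval can stay large without delaying anything. Only a quiet channel ever waits for the timer, which is the case the lost last message falls into.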


 Seems to me the scenario explained could happen relatively easily if two 
 nodes are started simultaneously.

No, the startup won't trigger concurrent connections, as only the joiner 
connects to the coordinator and the coordinator reuses the same 
connection to send the JOIN-RSP back.

It is the rebalancing process that triggers this in certain cases; the 
forwarding of state transfer requests to different owners.

 We no longer ask users to stagger their startups?

Because concurrent startup works, and not having to stagger startup is 
simpler.

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)



Re: [infinispan-dev] Put issues with newly joining node

2012-12-05 Thread Galder Zamarreño

On Dec 4, 2012, at 11:30 AM, Dan Berindei dan.berin...@gmail.com wrote:

 
 On Tue, Dec 4, 2012 at 11:32 AM, Mircea Markus mmar...@redhat.com wrote:
 
 On 4 Dec 2012, at 09:22, Sanne Grinovero wrote:
 
 [...]
 
 I don't think the cache should ever be in an illegal state to be used
 after being started. So Infinispan should not require tests to wait
 for a cluster to be formed, I'd rather guarantee that after a cache
 is started it's usable.
 +1. Unless the test relies/verifies internal state, e.g. locks being 
 acquired, data present in the data container etc.
 
 
 It's not just a question of what you want to check, it's also a question of 
 what you don't want to check... I think in general a test should focus on a 
 specific issue, and we know state transfer is always a potential source of 
 (unrelated) failures. So I'd rather have tests that do test state transfer 
 and command forwarding, and tests that avoid state transfer and command 
 forwarding (by waiting for the cluster to form completely).
 
 I'm pretty sure this is another instance of ISPN-2473, and once we have a fix 
 (and a unit test) for this particular failure, MarshallExternalPojosTest 
 could very well wait for the cluster to form and ignore any state 
 transfer-related issues.
 
 BTW, I also got an exception yesterday in MarshallExternalPojosTest and I
 investigated it, but in my case the error was much weirder: two nodes both
 opened a TCP connection to each other, yet neither of them received the
 forwarded command. I've asked Bela to investigate as well, but he didn't find
 anything suspicious in JGroups.

Ok, wrt ISPN-2541, I suggest holding off until all other known issues have been 
solved and see if the issue keeps appearing.

It seems to be a good test for catching these issues (indirectly), so it could 
be useful to verify :)

 
 
 


--
Galder Zamarreño
gal...@redhat.com
twitter.com/galderz

Project Lead, Escalante
http://escalante.io

Engineer, Infinispan
http://infinispan.org




Re: [infinispan-dev] Put issues with newly joining node

2012-12-05 Thread Sanne Grinovero
On 5 December 2012 11:02, Galder Zamarreño gal...@redhat.com wrote:

 On Dec 4, 2012, at 10:22 AM, Sanne Grinovero sa...@infinispan.org wrote:

 On 4 December 2012 09:14, Galder Zamarreño gal...@redhat.com wrote:
 Hey Dan/Adrian,

 Re: https://issues.jboss.org/browse/ISPN-2541

 I'm looking at this intermittent failure, and it seems to be caused by the
 fact that the test does not wait for the cluster to be formed when the new
 node is started, which can lead to a replication timeout failure from the
 newly joining node.

 The test can easily be fixed by waiting for the cluster to form and then
 making the call.

 [...]

 I don't think the cache should ever be in an illegal state to be used
 after being started. So Infinispan should not require tests to wait
 for a cluster to be formed, I'd rather guarantee that after a cache
 is started it's usable.

 Precisely, which is why I raised the flag instead of going down the easy path.


 If this is not possible, then any application would also need to wait
 for that cluster formed event, and we should expose an API for that.

 The problem is considering when a cluster is formed. How many nodes should 
 you wait for?


Why can't we rely on JGroups Discovery to know that? As a user I have
already specified the expected initial group size with
num_initial_members, and I don't want to repeat that configuration ;-)

Sanne


Re: [infinispan-dev] Put issues with newly joining node

2012-12-05 Thread Galder Zamarreño

On Dec 5, 2012, at 1:23 PM, Sanne Grinovero sa...@infinispan.org wrote:

 On 5 December 2012 11:02, Galder Zamarreño gal...@redhat.com wrote:
 
 On Dec 4, 2012, at 10:22 AM, Sanne Grinovero sa...@infinispan.org wrote:
 
 On 4 December 2012 09:14, Galder Zamarreño gal...@redhat.com wrote:
 Hey Dan/Adrian,
 
 Re: https://issues.jboss.org/browse/ISPN-2541
 
 I'm looking at this intermittent failure, and it seems to be caused by the
 fact that the test does not wait for the cluster to be formed when the new
 node is started, which can lead to a replication timeout failure from the
 newly joining node.
 
 The test can easily be fixed by waiting for the cluster to form and then
 making the call.
 
 [...]
 
 I don't think the cache should ever be in an illegal state to be used
 after being started. So Infinispan should not require tests to wait
 for a cluster to be formed, I'd rather guarantee that after a cache
 is started it's usable.
 
 Precisely, which is why I raised the flag instead of going down the easy 
 path.
 
 
 If this is not possible, then any application would also need to wait
 for that cluster formed event, and we should expose an API for that.
 
 The problem is considering when a cluster is formed. How many nodes should 
 you wait for?
 
 
 Why can't we rely on JGroups Discovery to know that, as a user I
 already specified the expected initial group size with
 num_initial_members
 Don't want to repeat that configuration ;-)

The num_initial_members setting is simply used to decide who's the coordinator
and has no relationship with the number of nodes that are in the cluster.

I don't think it's the same thing, but could be reused...

 
 Sanne
 


--
Galder Zamarreño
gal...@redhat.com
twitter.com/galderz

Project Lead, Escalante
http://escalante.io

Engineer, Infinispan
http://infinispan.org




Re: [infinispan-dev] Put issues with newly joining node

2012-12-05 Thread Bela Ban

On 12/5/12 1:23 PM, Sanne Grinovero wrote:
 On 5 December 2012 11:02, Galder Zamarreño gal...@redhat.com wrote:
 On Dec 4, 2012, at 10:22 AM, Sanne Grinovero sa...@infinispan.org wrote:

 On 4 December 2012 09:14, Galder Zamarreño gal...@redhat.com wrote:
 Hey Dan/Adrian,

 Re: https://issues.jboss.org/browse/ISPN-2541

 I'm looking at this intermittent failure, and it seems to be caused by the
 fact that the test does not wait for the cluster to be formed when the new
 node is started, which can lead to a replication timeout failure from the
 newly joining node.

 The test can easily be fixed by waiting for the cluster to form and then
 making the call.

 [...]

 I don't think the cache should ever be in an illegal state to be used
 after being started. So Infinispan should not require tests to wait
 for a cluster to be formed, I'd rather guarantee that after a cache
 is started it's usable.
 Precisely, which is why I raised the flag instead of going down the easy 
 path.

 If this is not possible, then any application would also need to wait
 for that cluster formed event, and we should expose an API for that.
 The problem is considering when a cluster is formed. How many nodes should 
 you wait for?

 Why can't we rely on JGroups Discovery to know that, as a user I
 already specified the expected initial group size with
 num_initial_members
 Don't want to repeat that configuration ;-)


I don't understand this discussion: when a new node joins, it returns
from JChannel.connect() once it has received a JOIN response from the
coordinator, with the current view... or are you guys talking about
Infinispan's 'service views'?

-- 
Bela Ban, JGroups lead (http://www.jgroups.org)


Re: [infinispan-dev] Put issues with newly joining node

2012-12-05 Thread Sanne Grinovero
On 5 December 2012 14:01, Bela Ban b...@redhat.com wrote:

 On 12/5/12 1:23 PM, Sanne Grinovero wrote:
 On 5 December 2012 11:02, Galder Zamarreño gal...@redhat.com wrote:
 On Dec 4, 2012, at 10:22 AM, Sanne Grinovero sa...@infinispan.org wrote:

 On 4 December 2012 09:14, Galder Zamarreño gal...@redhat.com wrote:
 Hey Dan/Adrian,

 Re: https://issues.jboss.org/browse/ISPN-2541

 I'm looking at this intermittent failure, and it seems to be caused by
 the fact that the test does not wait for the cluster to be formed when
 the new node is started, which can lead to a replication timeout failure
 from the newly joining node.

 The test can easily be fixed by waiting for the cluster to form and then
 making the call.

 [...]

 I don't think the cache should ever be in an illegal state to be used
 after being started. So Infinispan should not require tests to wait
 for a cluster to be formed, I'd rather guarantee that after a cache
 is started it's usable.
 Precisely, which is why I raised the flag instead of going down the easy 
 path.

 If this is not possible, then any application would also need to wait
 for that cluster formed event, and we should expose an API for that.
 The problem is considering when a cluster is formed. How many nodes should 
 you wait for?

 Why can't we rely on JGroups Discovery to know that, as a user I
 already specified the expected initial group size with
 num_initial_members
 Don't want to repeat that configuration ;-)


 I don't understand this discussion: when a new node joins, it'll return
 from JChannel.connect() when it received a JOIN response from the
 coordinator, with the current view... or are you guys talking about
 Infinispan's 'service views' ?

+1

That's why I'm confused too, and not understanding how it is possible
that a Cache is returned to the application - which doesn't have a
clue about number of expected nodes - in a state for which the
cluster is not formed yet. That should never happen!?

I never understood why the test framework in Infinispan requires this
to happen in all tests - even in the cases listed by Mircea that the
testsuite is looking for something very specific, I would expect the
wait to be unnecessary. (or more precisely, to have been blocked
already for long enough)


Re: [infinispan-dev] Separate ExecutorService for map/reduce tasks?

2012-12-05 Thread Vladimir Blagojevic

On 12-12-05 5:07 AM, Mircea Markus wrote:


On 5 Dec 2012, at 08:36, Tristan Tarrant wrote:


In 6.0 I would really like to go away from the current executor
configuration (e.g. a specific element for every executor) and allow the
creation of named executors (this is how the AS configuration works).


So that you can refer to the same executor from multiple places?


Yeah, Tristan, can you elaborate a bit more, I am now curious too!

Re: [infinispan-dev] Put issues with newly joining node

2012-12-05 Thread Dan Berindei
On Wed, Dec 5, 2012 at 4:20 PM, Sanne Grinovero sa...@infinispan.org wrote:

 On 5 December 2012 14:01, Bela Ban b...@redhat.com wrote:
 
  On 12/5/12 1:23 PM, Sanne Grinovero wrote:
  On 5 December 2012 11:02, Galder Zamarreño gal...@redhat.com wrote:
  On Dec 4, 2012, at 10:22 AM, Sanne Grinovero sa...@infinispan.org
 wrote:
 
  On 4 December 2012 09:14, Galder Zamarreño gal...@redhat.com wrote:
  Hey Dan/Adrian,
 
  Re: https://issues.jboss.org/browse/ISPN-2541
 
  I'm looking at this intermittent failure, and it seems to be caused
  by the fact that the test does not wait for the cluster to be formed when
  the new node is started, which can lead to a replication timeout failure
  from the newly joining node.
 
  The test can easily be fixed by waiting for the cluster to form and
  then making the call.
 
  [...]
 
  I don't think the cache should ever be in an illegal state to be used
  after being started. So Infinispan should not require tests to wait
  for a cluster to be formed, I'd rather guarantee that after a cache
  is started it's usable.
  Precisely, which is why I raised the flag instead of going down the
 easy path.
 
  If this is not possible, then any application would also need to wait
  for that cluster formed event, and we should expose an API for that.
  The problem is considering when a cluster is formed. How many nodes
 should you wait for?
 
  Why can't we rely on JGroups Discovery to know that, as a user I
  already specified the expected initial group size with
  num_initial_members
  Don't want to repeat that configuration ;-)
 
 
  I don't understand this discussion: when a new node joins, it'll return
  from JChannel.connect() when it received a JOIN response from the
  coordinator, with the current view... or are you guys talking about
  Infinispan's 'service views' ?

 +1

 That's why I'm confused too, and not understanding how it is possible
 that a Cache is returned to the application - which doesn't have a
 clue about number of expected nodes - in a state for which the
 cluster is not formed yet. That should never happen!?


It's simple: getCache() returns once the joiner has received ownership of
some segments (in distributed mode) and once it has received all the data it
owns (dist and repl). This does not guarantee that the other nodes see the
joiner as a full member at the time getCache() returns.

This doesn't mean that the cache is not functional, on the contrary we
could return even before the joiner had received the data and the cache
would still work. But because some nodes think state transfer is still in
progress, the tests do run into state transfer corner cases that aren't
handled properly (they're getting rarer, but we still have them).



 I never understood why the test framework in Infinispan requires this
 to happen in all tests - even in the cases listed by Mircea that the
 testsuite is looking for something very specific, I would expect the
 wait to be unnecessary. (or more precisely, to have been blocked
 already for long enough)


getCache() only waits long enough for the cache to work; it doesn't wait (and
I don't think it should wait) for all the other nodes to acknowledge the
joiner as a full member (i.e. in the read consistent hash). Because of
this, assertions made on nodes other than the joiner can fail (in addition
to the aforementioned corner cases in state transfer).

It's also possible (and it was quite likely with older JGroups versions)
that a joiner would actually form a new cluster by itself instead of
joining the existing nodes in a single cluster. When that happens,
getCache() definitely returns without the cluster being formed, and we have
to wait for the separate clusters to find each other and merge before
running our test.
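
The kind of wait the tests perform before making assertions can be sketched in plain Java. This is a minimal sketch and not Infinispan's actual test utility: `ClusterWait`, `awaitClusterSize` and the `IntSupplier` stand-ins for "how many members does this node currently see" are hypothetical.

```java
import java.util.List;
import java.util.function.IntSupplier;

final class ClusterWait {
    /**
     * Polls until every node reports expectedSize members, or the timeout expires.
     * Returns true if all nodes agreed in time, false otherwise.
     */
    static boolean awaitClusterSize(List<IntSupplier> memberCounts, int expectedSize,
                                    long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            boolean allAgree = true;
            for (IntSupplier count : memberCounts) {
                if (count.getAsInt() != expectedSize) {
                    allAgree = false;  // at least one node has not caught up yet
                    break;
                }
            }
            if (allAgree) return true;
            if (System.nanoTime() - deadline >= 0) return false;
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();  // preserve the interrupt for the caller
                return false;
            }
        }
    }
}
```

The check covers every node rather than just the joiner, because the joiner's own view can be complete while the others still treat it as joining. It would also cover the split case, where a joiner that formed its own cluster reports one member until the merge happens.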

Cheers
Dan

Re: [infinispan-dev] Put issues with newly joining node

2012-12-05 Thread Sanne Grinovero
So to make sure I understood that, this has no visible impact on the
functionality of API methods, correct? Like any get operation would
successfully retrieve a remote entry if one exists somewhere?

On 5 December 2012 15:42, Dan Berindei dan.berin...@gmail.com wrote:

 [...]

Re: [infinispan-dev] Separate ExecutorService for map/reduce tasks?

2012-12-05 Thread Tristan Tarrant

Yes.

Ideally I would like to have:

// Define the executors once, by name
GlobalConfigurationBuilder global = new GlobalConfigurationBuilder();
global
    .addExecutor().name("blah")
    .addScheduledExecutor().name("sched");

// Refer to them by name from the cache configuration
ConfigurationBuilder config = new ConfigurationBuilder();
config
    .clustering().async().replQueueExecutor("blah")
    .eviction().executor("sched");

Don't take the above as a proposed API, it's just to make things clearer.
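
To make the "define once, refer by name" idea concrete without depending on any Infinispan API, here is a small self-contained sketch in plain Java. `NamedExecutors` and its methods are hypothetical and only illustrate the lookup pattern; they are not a proposal for the real configuration classes.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;

/** Hypothetical registry, not an Infinispan class. */
final class NamedExecutors {
    private final Map<String, ExecutorService> byName = new ConcurrentHashMap<>();

    /** Registers an executor under a name; fails if the name is already taken. */
    NamedExecutors add(String name, ExecutorService executor) {
        if (byName.putIfAbsent(name, executor) != null) {
            throw new IllegalArgumentException("Executor already defined: " + name);
        }
        return this;
    }

    /** Looks up an executor by name, so several components can share one pool. */
    ExecutorService get(String name) {
        ExecutorService executor = byName.get(name);
        if (executor == null) {
            throw new IllegalArgumentException("No executor named: " + name);
        }
        return executor;
    }

    /** Shuts down every registered executor. */
    void shutdownAll() {
        byName.values().forEach(ExecutorService::shutdown);
    }
}
```

Two components that both ask for "blah" get the same `ExecutorService` instance, which is the sharing Mircea asked about. A dedicated map/reduce executor would then just be one more name rather than a new configuration element.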

Tristan


On 12/05/2012 03:55 PM, Vladimir Blagojevic wrote:

On 12-12-05 5:07 AM, Mircea Markus wrote:


On 5 Dec 2012, at 08:36, Tristan Tarrant wrote:


In 6.0 I would really like to go away from the current executor
configuration (e.g. a specific element for every executor) and allow the
creation of named executors (this is how the AS configuration works).


So that you can refer to the same executor from multiple places?


Yeah, Tristan, can you elaborate a bit more, I am now curious too!





Re: [infinispan-dev] Put issues with newly joining node

2012-12-05 Thread Dan Berindei
Yes, no visible impact.


On Wed, Dec 5, 2012 at 5:46 PM, Sanne Grinovero sa...@infinispan.org wrote:

 So to make sure I understood that, this has no visible impact on the
 functionality of API methods, correct? Like any get operation would
 successfully retrieve a remote entry if one exists somewhere?


[infinispan-dev] 5.2.0.Beta6 schedule

2012-12-05 Thread Mircea Markus
Hi,

Beta6 will be cut on 13 Dec. Here's the list[1] of bugs scheduled: 
http://goo.gl/ILjeM
Also just a heads up, CR1 is scheduled for 21 Dec.

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)





Re: [infinispan-dev] Separate ExecutorService for map/reduce tasks?

2012-12-05 Thread Mircea Markus

On 5 Dec 2012, at 15:53, Tristan Tarrant wrote:

 GlobalConfigurationBuilder global = new GlobalConfigurationBuilder();
 global
     .addExecutor().name("blah")
     .addScheduledExecutor().name("sched");
 
 ConfigurationBuilder config = new ConfigurationBuilder();
 config
     .clustering().async().replQueueExecutor("blah")
     .eviction().executor("sched");


+1

Cheers,
-- 
Mircea Markus
Infinispan lead (www.infinispan.org)



