John,
You can use the ClientMembershipListener in the client as follows:
1. Register the ClientMembershipListener before initializing
the ClientCache. This ensures that the client gets callbacks for
servers that connect when its pool is pre-filled with connections.
2. Initialize ClientCache
3. Wait for server to start
4. When notified, continue
In my test, I used a CountDownLatch known to both the TestClient and
ClientMembershipListener to wait and notify.
In my TestClient, I defined waitForServer (step 3) like:
  private void waitForServer() throws InterruptedException {
    System.out.println("Waiting for server");
    this.latch.await();
    System.out.println("Done waiting for server");
  }
In my ClientMembershipListener, I defined memberJoined (step 4) like:
  public void memberJoined(ClientMembershipEvent event) {
    if (!event.isClient()) {
      this.latch.countDown();
    }
  }
If I start the TestClient before the server starts, it waits in
waitForServer. Once the server starts, it gets notified and continues. If I
start the TestClient after the server starts, it just sails through the
waitForServer.
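
For reference, here is a minimal, self-contained sketch of the wait/notify
pattern described above. It uses only java.util.concurrent so that it runs
without GemFire on the classpath: ServerEvent and ServerJoinListener are
hypothetical stand-ins for ClientMembershipEvent and a
ClientMembershipListener implementation, and the "server" is simulated by a
thread that fires the callback after a delay. The timeout on await is also an
addition of this sketch (my test used the untimed await), on the assumption
that a test should fail rather than hang if the server never starts.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LatchWaitSketch {

    // Hypothetical stand-in for ClientMembershipEvent.
    static final class ServerEvent {
        private final boolean client;
        ServerEvent(boolean client) { this.client = client; }
        boolean isClient() { return client; }
    }

    // Hypothetical stand-in for a ClientMembershipListener implementation.
    static final class ServerJoinListener {
        private final CountDownLatch latch;
        ServerJoinListener(CountDownLatch latch) { this.latch = latch; }

        // Step 4: release the waiting client only when a server (not a client) joins.
        void memberJoined(ServerEvent event) {
            if (!event.isClient()) {
                latch.countDown();
            }
        }
    }

    // Step 3: block until the listener is notified or the timeout elapses.
    static boolean waitForServer(CountDownLatch latch, long timeoutMillis) {
        try {
            return latch.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Simulates a server that becomes available after serverStartDelayMillis.
    static boolean runScenario(long serverStartDelayMillis, long timeoutMillis) {
        CountDownLatch latch = new CountDownLatch(1);
        // Step 1: create (in real code: register) the listener before anything connects.
        ServerJoinListener listener = new ServerJoinListener(latch);

        Thread server = new Thread(() -> {
            try {
                Thread.sleep(serverStartDelayMillis);
                listener.memberJoined(new ServerEvent(false));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        server.setDaemon(true);
        server.start();

        return waitForServer(latch, timeoutMillis);
    }

    public static void main(String[] args) {
        System.out.println("Server joined in time: " + runScenario(100, 2000));
    }
}
```

In the real client, the listener would implement ClientMembershipListener and
be registered before the ClientCache is created, presumably through the
ClientMembership.registerClientMembershipListener method you linked to.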
Barry Oglesby
GemFire Advanced Customer Engineering (ACE)
For immediate support please contact Pivotal Support at
http://support.pivotal.io/
On Wed, Jan 20, 2016 at 6:59 PM, John Blum <[email protected]> wrote:
> Hi Barry-
>
> Thank you for the quick response.
>
> In my test class, I only start one (standalone) server (i.e. no locators,
> no other servers, etc) during setup, and the client (test) connects
> directly to that server. Unfortunately, without explicit coordination, it
> is entirely possible that the client will start first and attempt to
> connect while performing a cache operation before the server is fully
> started/initialized (and technically listening for/accepting client
> connections).
>
> So, unless there is retry logic in the Pool to try and connect until X
> number of attempts at Y intervals has been made, I am not sure how the
> ClientMembershipListener will work in this case, especially since the
> client has not connected (to anything) yet.
>
>
> *1. How do you register a ClientMembershipListener on the client?*
>
> I see that it can be registered with the ClientMembership
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/management/membership/ClientMembership.html#registerClientMembershipListener(com.gemstone.gemfire.management.membership.ClientMembershipListener)>
> [0]
> class, but that appears to be a management component used on the server.
> However, based on the *Javadoc* description (for memberJoined(event)
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/management/membership/ClientMembershipListener.html#memberJoined(com.gemstone.gemfire.management.membership.ClientMembershipEvent)>
> [1])...
>
> *"Invoked when a client has connected to this process or when this process
> has connected to a CacheServer."*
>
> It does imply the ClientMembershipListener can be registered and used on
> the client, assuming the 2nd reference to "this process" actually refers to
> the client.
>
>
> 2. Then, I was thinking, if a ClientMembershipListener can somehow be
> registered on the client (where?), that I could potentially implement
> memberJoined(event)
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/management/membership/ClientMembershipListener.html#memberJoined(com.gemstone.gemfire.management.membership.ClientMembershipEvent)>
> to
> block any client cache operations (implemented in the test) until the
> client has actually connected. I assume this is what you mean by...
>
> *"You could possibly use a ClientMembershipListener. If you install one in
> your client, the memberJoined callback will tell you when the client
> connects to the server."*
>
> However, I do not see where to register the ClientMembershipListener;
> neither the ClientCacheFactory nor the PoolFactory has any such API to perform the
> registration, and presumably, this would need to be done prior to
> connecting in order to receive the event when the client does, finally,
> successfully connect.
>
>
> 3. Or, I was also thinking, based on the "*How the Pool Connects to a
> Server*" section in the GemFire User Guide here
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/topologies_and_comm/topology_concepts/how_the_pool_manages_connections.html>,
> that it may also be feasible to use a combination of freeConnectionTimeout
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setFreeConnectionTimeout(int)>
> [2] with
> pingIntervals
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setPingInterval(long)>
> [3]
> (and perhaps, readTimeout
> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setReadTimeout(int)>
> [4])
> to delay the cache operation until the server becomes available. Though, my
> thinking here may be off base, and this approach seems less reliable given
> the time-dependent, race condition nature of it.
>
>
> Thanks,
> John
>
> [0] -
> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/management/membership/ClientMembership.html#registerClientMembershipListener(com.gemstone.gemfire.management.membership.ClientMembershipListener)
> [1] -
> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/management/membership/ClientMembershipListener.html#memberJoined(com.gemstone.gemfire.management.membership.ClientMembershipEvent)
> [2] -
> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setFreeConnectionTimeout(int)
> [3] -
> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setPingInterval(long)
> [4] -
> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setReadTimeout(int)
>
>
> On Wed, Jan 20, 2016 at 4:11 PM, Barry Oglesby <[email protected]>
> wrote:
>
>> John,
>>
>> CacheServer isRunning is not a reliable way to determine whether the
>> CacheServer acceptor is actually listening for connections.
>> BridgeServerImpl isRunning (the implementation) asks if the Acceptor is
>> non-null and isRunning, which in turn just asks whether it (the Acceptor)
>> is not shutdown. The CacheServer isRunning could be true before the
>> Acceptor is listening for connections.
>>
>> You could possibly use a ClientMembershipListener. If you install one in
>> your client, the memberJoined callback will tell you when the client
>> connects to the server. This will more-or-less do what your custom socket
>> code is doing now. It doesn't necessarily tell you all the servers though -
>> only the ones that the client has connected to.
>>
>> There is another option to see all the servers. It only works if you have
>> a locator, and it uses some java public (but not Geode public) API. This
>> API can be used by the client to determine how many servers there are and
>> their locations. I can point you to that if you're interested.
>>
>>
>> Barry Oglesby
>> GemFire Advanced Customer Engineering (ACE)
>> For immediate support please contact Pivotal Support at
>> http://support.pivotal.io/
>>
>>
>> On Wed, Jan 20, 2016 at 3:28 PM, John Blum <[email protected]> wrote:
>>
>>> Is there a recommended, (more) reliable means to determine whether a
>>> CacheServer (listening for cache clients) has successfully started in a
>>> GemFire server from the client-side?
>>>
>>> Currently, I am employing a form of inter-process communication (e.g.
>>> control file) to coordinate the successful startup and general readiness of
>>> a server before a client cache attempts to connect inside an integration
>>> test.
>>>
>>> In this case, the test acts as the cache client and connects to the
>>> server, but not before forking a GemFire server process during setup, and
>>> ideally not before the server is ready (and specifically, not until
>>> ServerSocket is "accepting" connections).
>>>
>>> For the most part, this works fairly consistently, except there exist
>>> potential timing issues in the test for server readiness (and specifically,
>>> CacheServer listening for connections), particularly on the server
>>> before writing the control file. For example, I have included this code
>>> block...
>>>
>>> assertThat(*waitOnCondition*(new Condition() {
>>> @Override public boolean evaluate() {
>>> * return gemfireCacheServer.isRunning();*
>>> }
>>> }), is(true));
>>>
>>> writeProcessControlFile(WORKING_DIRECTORY);
>>>
>>> The client (i.e. test) then checks for the presence of this control file
>>> before executing the tests.
>>>
>>> The waitOnCondition(:Condition) method (see below) functions properly,
>>> waiting on the condition for a specified duration (defaults to 20 seconds),
>>> checking every 500 ms. However, it would seem CacheServer.isRunning()
>>> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/server/CacheServer.html#isRunning()>
>>> [0] can
>>> potentially return *true* before the ServerSocket listening for client
>>> connections is actually "accepting" connections. It is less than clear
>>> from the Javadoc, (and thus, the user's POV) what
>>> CacheServer.isRunning() actually does (without having to dig into code).
>>>
>>> So, I thought, perhaps a more reliable means to determine whether the
>>> server is actually ready, listening for and accepting connections, would be
>>> to just open a Socket connection on the client. If I can connect, then
>>> the server is presumably ready. So, I coded...
>>>
>>> boolean waitForCacheServerToStart(final String host,
>>> final int port, long duration) {
>>> return *waitOnCondition*(new Condition() {
>>> AtomicBoolean connected = new AtomicBoolean(false);
>>>
>>> public boolean evaluate() {
>>> Socket socket = null;
>>>
>>> try {
>>> // NOTE: the following code is not meant to be an atomic, compound action
>>> // (a possible race condition); opening another connection (at the expense
>>> // of using system resources) after connectivity has already been
>>> // established is not detrimental in this use case
>>> if (!connected.get()) {
>>> * socket = new Socket(host, port);*
>>> connected.set(true);
>>> }
>>> }
>>> catch (IOException ignore) {
>>> }
>>> finally {
>>> GemFireUtils.close(socket);
>>> }
>>>
>>> return connected.get();
>>> }
>>> }, duration);
>>> }
>>>
>>> This seems to work OK, though, since I turn around and close the
>>> connection right away, before completing the "handshake", Geode throws...
>>>
>>> [warn 2016/01/20 14:12:42.599 PST <Handshaker localhost/127.0.0.1:12480
>>> Thread 0> tid=0x22] Bridge server: failed accepting client connection {0}
>>> java.io.EOFException
>>> at com.gemstone.gemfire.internal.cache.tier.sockets.AcceptorImpl.handleNewClientConnection(AcceptorImpl.java:1508)
>>> at com.gemstone.gemfire.internal.cache.tier.sockets.AcceptorImpl$5.run(AcceptorImpl.java:1391)
>>> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>>> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>>> at java.lang.Thread.run(Thread.java:745)
>>>
>>>
>>> There really does not appear to be a better way using the Geode API, and
>>> in particular, the PoolFactory
>>> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html>
>>> [1],
>>> to set, say, a *retryConnectionTimeout* property along with a
>>> *retryConnectionAttempts* property when populating the pool with
>>> connections, at least initially during startup, or even when adding more
>>> connections to the pool (up to the "max") during heavier loads, unlike
>>> similar properties for read/requests operations... setReadTimeout(:int)
>>> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setReadTimeout(int)>
>>> [2]
>>> and setRetryAttempts(:int)
>>> <http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setRetryAttempts(int)>
>>> [3].
>>>
>>> Am I missing anything? Other ideas/recommendations?
>>>
>>> Thanks,
>>> -John
>>>
>>> [0] -
>>> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/server/CacheServer.html#isRunning()
>>> [1] -
>>> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html
>>> [2] -
>>> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setReadTimeout(int)
>>> [3] -
>>> http://gemfire.docs.pivotal.io/docs-gemfire/latest/javadocs/japi/com/gemstone/gemfire/cache/client/PoolFactory.html#setRetryAttempts(int)
>>>
>>>
>>> P.S. code for waitOnCondition(..) for the curious minded, ;-)
>>>
>>> static final long DEFAULT_WAIT_DURATION = TimeUnit.SECONDS.toMillis(20);
>>> static final long DEFAULT_WAIT_INTERVAL = 500L;
>>>
>>> @SuppressWarnings("unused")
>>> boolean waitOnCondition(Condition condition) {
>>> return waitOnCondition(condition, DEFAULT_WAIT_DURATION);
>>> }
>>>
>>> @SuppressWarnings("all")
>>> boolean waitOnCondition(Condition condition, long duration) {
>>> final long timeout = (System.currentTimeMillis() + duration);
>>>
>>> try {
>>> while (!condition.evaluate() && System.currentTimeMillis() < timeout) {
>>> synchronized (condition) {
>>> TimeUnit.MILLISECONDS.timedWait(condition, DEFAULT_WAIT_INTERVAL);
>>> }
>>> }
>>> }
>>> catch (InterruptedException e) {
>>> Thread.currentThread().interrupt();
>>> }
>>>
>>> return condition.evaluate();
>>> }
>>>
>>>
>>
>
>
> --
> -John
> 503-504-8657
> john.blum10101 (skype)
>