We are running into production issues: some clients are unable to connect to
the grid (16 server nodes running on Linux). The error is:

Caused by: class org.apache.ignite.spi.IgniteSpiException: Join process
timed out, did not receive response for join request (consider increasing
'joinTimeout' configuration property) [joinTimeout=5000, sock=null]
       at org.apache.ignite.spi.discovery.tcp.ClientImpl$MessageWorker.body(ClientImpl.java:1334)
       at org.apache.ignite.spi.IgniteSpiThread.run(IgniteSpiThread.java:62)


DiscoverySpi has these settings:
                <property name="joinTimeout" value="5000"/>
                <property name="ackTimeout" value="5000"/>
                <property name="maxAckTimeout" value="30000"/>            
                <property name="reconnectCount" value="5"/>

When the clients got this error, we tried increasing the timeout to
30 seconds and even 50 seconds, but new client connections from some Windows
machines still would not go through. We read
http://apache-ignite-users.70518.x6.nabble.com/Help-with-tuning-for-larger-clusters-td1692.html
and got rid of joinTimeout and started using networkTimeout. It seems to be
working this way so far (have not yet pushed to production).
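
For illustration, the change described above would look roughly like the
fragment below. This is only a sketch, not our actual file: the
networkTimeout value is a placeholder, keeping the other three properties
unchanged is an assumption, and the ipFinder and the enclosing
IgniteConfiguration bean are omitted.

```xml
<!-- Sketch only: goes inside the IgniteConfiguration bean. -->
<property name="discoverySpi">
    <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <!-- joinTimeout removed; networkTimeout used instead.
             30000 is a placeholder value, not a recommendation. -->
        <property name="networkTimeout" value="30000"/>
        <!-- Assumed to be kept as in the original settings. -->
        <property name="ackTimeout" value="5000"/>
        <property name="maxAckTimeout" value="30000"/>
        <property name="reconnectCount" value="5"/>
    </bean>
</property>
```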

When we specify joinTimeout along with networkTimeout, we still cannot
connect.

Question 1) What is the difference between these two settings, joinTimeout and
networkTimeout?
Question 2) Without a joinTimeout in the test environment, the client hangs
forever if the cluster is down (because joinTimeout is then infinite). How do we
make sure that the client still proceeds even if it cannot connect to the
cluster? We need clients to proceed in the testing environment even if the grid
is down.
Question 3) Both clients and servers are using TcpDiscoverySpi. Is that
right? Should we be using TcpCommunicationSpi instead?
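
To make Question 3 concrete, here is a sketch of where each SPI would sit
on a client's IgniteConfiguration, as far as we understand it. The two are
separate properties on the same bean. This layout is an assumption for
illustration, not a copy of our configuration, and all tuning properties
are left out.

```xml
<!-- Sketch only: hypothetical client configuration, properties trimmed. -->
<bean class="org.apache.ignite.configuration.IgniteConfiguration">
    <property name="clientMode" value="true"/>

    <!-- Discovery SPI: what we currently configure on clients and servers. -->
    <property name="discoverySpi">
        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi"/>
    </property>

    <!-- Communication SPI: the one we are asking about. -->
    <property name="communicationSpi">
        <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi"/>
    </property>
</bean>
```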

Thanks,
Binti




--
View this message in context: 
http://apache-ignite-users.70518.x6.nabble.com/Client-fails-to-connect-joinTimeout-vs-networkTimeout-tp4419.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
