Hi Stan,

I'm 100% for this activity. However, I don't think we should change the behavior of the timeouts you listed in #2: this can lead to unexpected behavior for users who already use them. I would just deprecate them and eventually remove them.
-Val

On Mon, May 28, 2018 at 1:29 PM, Stanislav Lukyanov <[email protected]> wrote:

> Hi folks,
>
> It looks like we stopped halfway with this activity. I’d like to pick it up.
>
> All seem to agree that we should simplify the timeout settings.
> Here are the specific actions I’d like to propose:
>
> 1. Promote the use of global timeouts as the best practice
> *What*: update the docs to encourage users to rely on the following
> timeouts for their “network stability” settings
>   IgniteConfiguration.failureDetectionTimeout
>   IgniteConfiguration.clientFailureDetectionTimeout
>   IgniteConfiguration.networkTimeout
> *When*: update readme.io docs for 2.5 and Javadoc for 2.6
>
> 2. Discourage the use of finer timeouts
> *What*:
> - update the docs to discourage users from using the following timeouts
>   and announce their upcoming deprecation and removal
>   TcpDiscoverySpi.socketTimeout
>   TcpDiscoverySpi.ackTimeout
>   TcpDiscoverySpi.maxAckTimeout
>   TcpDiscoverySpi.reconnectCount
>   TcpCommunicationSpi.connectTimeout
>   TcpCommunicationSpi.maxConnectTimeout
>   TcpCommunicationSpi.reconnectCount
> - deprecate the properties in code
> - remove the properties in code
> *When*:
> - readme.io update with deprecation announcement for 2.5
> - @Deprecated in code + Javadoc update + respective readme.io rewording
>   for 2.6
> - properties removal in 3.0
>
> 3. Make “orphan” timeouts rely on global timeouts, then deprecate and
> remove
> *What*:
> Two settings currently don’t default to the global equivalents, although
> they should:
> - TcpCommunicationSpi.socketWriteTimeout should default to
>   failureDetectionTimeout
> - TcpDiscoverySpi.networkTimeout should default to
>   IgniteConfiguration.networkTimeout
> So the course of action would be:
> - update the docs to explain that these timeouts have to be used for now,
>   but announce their upcoming deprecation and removal
> - change the properties to default to their global counterparts and
>   deprecate them in code
> - remove the properties in code
> *When*:
> - readme.io update with deprecation announcement for 2.5
> - changing defaults + @Deprecated in code + Javadoc update + respective
>   readme.io rewording for 2.6
> - properties removal in 3.0
>
> 4. Don’t touch other timeouts
> Other timeouts, like TcpDiscoverySpi.joinTimeout or
> TcpCommunicationSpi.idleConnectionTimeout, are orthogonal to the whole
> “network stability” theme discussed above, and don’t have to be changed.
>
> Finally, I’ve prepared a draft of the docs page that may be used as a base
> for the readme.io update.
> This email is pretty long already, so please find the draft attached to
> the JIRA issue
> https://issues.apache.org/jira/browse/IGNITE-7704.
>
> Please share your thoughts.
>
> Thanks,
> Stan
>
> From: Alexey Popov
> Sent: March 1, 2018, 17:01
> To: [email protected]
> Subject: IgniteConfiguration, TcpDiscoverySpi, TcpCommunicationSpi timeouts
>
> Hi Igniters,
>
> We often see similar questions from users and customers about the
> IgniteConfiguration, TcpDiscoverySpi and TcpCommunicationSpi timeouts and
> how they relate to one another. We also see several side effects caused by
> incorrect timeout configuration.
>
> I tried to briefly describe these timeout settings (please see below) and
> found that most of them do not make sense in terms of cluster
> functions/operations and cannot be explained to users.
>
> I propose to deprecate most of them and leave only the timeouts we can
> explain in common terms (setFailureDetectionTimeout, setNetworkTimeout,
> setJoinTimeout and some others).
>
> Please let me know your thoughts.
>
> Thanks,
> Alexey
>
> GLOBAL:
>
> IgniteConfiguration.setNetworkTimeout:
> It is a global timeout for high-level operations that involve the network.
> For instance, IgniteMessaging delivery and the DiscoverySpi handshake use
> this timeout.
>
> IgniteConfiguration.setFailureDetectionTimeout:
> It is a global timeout for detecting failures in IgniteSpi implementations
> (including DiscoverySpi and CommunicationSpi).
> The failure detection algorithm bounds the whole set of simple network
> operations that belong to a single logical operation (for instance, the
> reliable delivery of a DiscoverySpi message within a cluster).
> The failure detection timeout is cumulative: it covers establishing the
> socket connection, sending and receiving data, and all socket retries (if
> a failure happens).
> This timeout is intended to simplify the failure detection condition from
> a user perspective.
>
> IgniteConfiguration.setClientFailureDetectionTimeout:
> It is the special case of the failure detection timeout used by
> DiscoverySpi for client nodes.
>
> TCP DISCOVERY SPI:
>
> If you need more control over the failure detection algorithm for
> TcpDiscoverySpi, you can explicitly use the following low-level options
> (doing so disables the failureDetectionTimeout logic):
>
> 1. TcpDiscoverySpi.setConnectTimeout - socket connection timeout
> 2. TcpDiscoverySpi.setReconnectCount - number of reconnect attempts used
>    when establishing a connection with the remote node and sending
>    messages to it
> 3. TcpDiscoverySpi.setSocketTimeout - socket write timeout. The write
>    operation is repeated getReconnectCount() times if it exceeds this
>    timeout
> 4. TcpDiscoverySpi.setAckTimeout - message acknowledgment timeout. If a
>    message acknowledgment is not received within this timeout, sending is
>    considered failed and the SPI tries to repeat the send operation. The
>    timeout is automatically doubled on each successive retry, up to the
>    getMaxAckTimeout value.
> 5. TcpDiscoverySpi.setMaxAckTimeout - maximum acknowledgment timeout; if
>    getAckTimeout reaches getMaxAckTimeout, the SPI gives up retrying the
>    send
>
> Other important TcpDiscoverySpi timeouts:
>
> TcpDiscoverySpi.setJoinTimeout - timeout for the join process when a
> new/restarted node joins a cluster. The node tries to connect to all
> available IP addresses provided by ipFinder within this timeout.
> If the timeout is exceeded, the node gives up and throws an exception
> from Ignition.start().
>
> TcpDiscoverySpi.setNetworkTimeout - timeout for high-level operations like
> the handshake. It looks like it should be deprecated and
> IgniteConfiguration.getNetworkTimeout should be used here instead.
>
> TCP COMMUNICATION SPI:
>
> If you need more control over the failure detection algorithm for
> TcpCommunicationSpi, you can explicitly use the following low-level
> options (doing so disables the failureDetectionTimeout logic):
>
> 1. TcpCommunicationSpi.setConnectTimeout - socket connection timeout; it
>    is automatically doubled on each successive retry (up to
>    getReconnectCount retries) related to a single logical operation
> 2. TcpCommunicationSpi.setMaxConnectTimeout - maximum connection timeout,
>    the upper limit for getConnectTimeout after it has been doubled up to
>    getReconnectCount times
> 3. TcpCommunicationSpi.setReconnectCount - number of reconnect attempts
>    used when establishing a connection with the remote node and sending
>    messages to it
>
> Other important TcpCommunicationSpi timeouts:
>
> TcpCommunicationSpi.setSocketWriteTimeout - timeout to send a message.
> TcpCommunicationSpi.setIdleConnectionTimeout - maximum idle connection
> timeout, after which a connection is closed.
>
> --
> Sent from: http://apache-ignite-developers.2346864.n4.nabble.com/
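
To make the doubling behaviour described in the quoted message for TcpDiscoverySpi.setAckTimeout / setMaxAckTimeout easier to picture, here is a minimal standalone sketch. It does not use Ignite at all: the class AckTimeoutSchedule and its schedule method are hypothetical names, the numbers are arbitrary example values rather than Ignite defaults, and the stopping rule (give up after the attempt that uses the maximum timeout) is an assumption based on the description, not the actual SPI implementation.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of an ack timeout that doubles on each retry. Not Ignite code. */
public class AckTimeoutSchedule {

    /**
     * Returns the ack timeouts (in ms) used on successive send attempts.
     * Assumption: start at ackTimeout, double after every failed attempt,
     * cap at maxAckTimeout, and stop after the attempt that used the cap.
     */
    public static List<Long> schedule(long ackTimeout, long maxAckTimeout) {
        if (ackTimeout <= 0 || maxAckTimeout < ackTimeout) {
            throw new IllegalArgumentException("need 0 < ackTimeout <= maxAckTimeout");
        }

        List<Long> timeouts = new ArrayList<>();
        long current = ackTimeout;

        while (true) {
            timeouts.add(current);

            // The cap has been reached: no further retries in this model.
            if (current >= maxAckTimeout) {
                break;
            }

            // Double, but never exceed the cap (written to avoid long overflow).
            current = (current > maxAckTimeout / 2) ? maxAckTimeout : current * 2;
        }

        return timeouts;
    }

    public static void main(String[] args) {
        // Example values only; they are not Ignite defaults.
        System.out.println(schedule(5_000L, 30_000L));
    }
}
```

The sum of the returned values gives a rough upper bound on how long a single message send could wait for acknowledgments under this model, which is the kind of arithmetic a user has to do by hand with the fine-grained settings and does not need with a single failureDetectionTimeout.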
