Re: IgniteConfiguration, TcpDiscoverySpi, TcpCommunicationSpi timeouts

Valentin Kulichenko Mon, 28 May 2018 15:45:48 -0700

Hi Stan,

I'm 100% for this activity, however I don't think we should change the
behavior of timeouts you listed in #2 - this can lead to unexpected
behavior for users who already use them. I would just deprecate them and
eventually remove them.


-Val

On Mon, May 28, 2018 at 1:29 PM, Stanislav Lukyanov <[email protected]>
wrote:

> Hi folks,
>
> It looks like we stopped half-way with this activity. I’d like to pick it
> up.
>
> All seem to agree that we should simplify the timeout settings.
> Here are the specific actions I’d like to propose:
>
> 1. Promote the use of global timeouts as the best practice
> *What*: update the docs to encourage users to rely on the following
> timeouts for their “network stability” settings
> IgniteConfiguration.failureDetectionTimeout
> IgniteConfiguration.clientFailureDetectionTimeout
> IgniteConfiguration.networkTimeout
> *When*: update readme.io docs for 2.5 and Javadoc for 2.6
>
> 2. Discourage the use of finer timeouts
> *What*:
> - update the docs to discourage users from using the following timeouts and
> announce their upcoming deprecation and removal
> TcpDiscoverySpi.socketTimeout
> TcpDiscoverySpi.ackTimeout
> TcpDiscoverySpi.maxAckTimeout
> TcpDiscoverySpi.reconnectCount
> TcpCommunicationSpi.connectTimeout
> TcpCommunicationSpi.maxConnectTimeout
> TcpCommunicationSpi.reconnectCount
> - deprecate the properties in code
> - remove the properties in code
> *When*:
> - readme.io update with deprecation announcement for 2.5
> - @Deprecated in code + Javadoc update + respective readme.io rewording
> for 2.6
> - properties removal in 3.0
>
> 3. Make “orphan” timeouts rely on global timeouts, then deprecate and
> remove
> *What*:
> Two settings currently don’t default to the global equivalents, although
> they should:
> - TcpCommunicationSpi.socketWriteTimeout should default to
> failureDetectionTimeout
> - TcpDiscoverySpi.networkTimeout should default to
> IgniteConfiguration.networkTimeout
> So the course of action would be:
> - update the docs to explain that these timeouts have to be used for now,
> but announce their upcoming deprecation and removal
> - change the properties to default to their global counterparts and
> deprecate them in code
> - remove the properties in code
> *When*:
> - readme.io update with deprecation announcement for 2.5
> - changing defaults + @Deprecated in code + Javadoc update + respective
> readme.io rewording for 2.6
> - properties removal in 3.0
>
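
To make the "default to the global counterpart" idea in item 3 concrete, here is a minimal sketch in plain Java. It is not Ignite code: the TimeoutDefaults class and effectiveTimeout method are hypothetical names, and the values are illustrative only. The assumption is simply that an explicitly set SPI-level value wins, and an unset one inherits the global timeout.

```java
import java.util.OptionalLong;

/** Hypothetical helper (not part of Ignite): resolves an SPI-level timeout against its global counterpart. */
final class TimeoutDefaults {
    private TimeoutDefaults() {}

    /**
     * Returns the SPI-level value when it was set explicitly,
     * otherwise falls back to the global value.
     */
    static long effectiveTimeout(OptionalLong spiLevel, long global) {
        if (global <= 0)
            throw new IllegalArgumentException("global timeout must be positive: " + global);
        return spiLevel.orElse(global);
    }

    public static void main(String[] args) {
        long failureDetectionTimeout = 10_000; // illustrative value, not a claim about Ignite defaults

        // socketWriteTimeout left unset: it inherits the global value.
        System.out.println(effectiveTimeout(OptionalLong.empty(), failureDetectionTimeout));

        // socketWriteTimeout set explicitly: the explicit value wins.
        System.out.println(effectiveTimeout(OptionalLong.of(2_000), failureDetectionTimeout));
    }
}
```

With this rule, users who never touched the SPI-level property get the global behavior automatically, while users who set it keep their value until the property is removed.
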
> 4. Don’t touch other timeouts
> Other timeouts, like TcpDiscoverySpi.joinTimeout or
> TcpCommunicationSpi.idleConnectionTimeout, are orthogonal to the whole
> “network stability” theme discussed above, and don’t have to be changed.
>
> Finally, I’ve prepared a draft of the docs page that may be used as a base
> for the readme.io update.
> This email is pretty long already, so please find the draft attached to
> the JIRA issue
> https://issues.apache.org/jira/browse/IGNITE-7704.
>
> Please share your thoughts.
>
> Thanks,
> Stan
>
> From: Alexey Popov
> Sent: March 1, 2018, 17:01
> To: [email protected]
> Subject: IgniteConfiguration, TcpDiscoverySpi, TcpCommunicationSpi timeouts
>
> Hi Igniters,
>
> We often see similar questions from users and customers about the
> IgniteConfiguration, TcpDiscoverySpi and TcpCommunicationSpi timeouts and
> how they relate to each other. We also see several side effects caused by
> incorrect timeout configuration.
>
> I tried to briefly describe these timeout settings (please see below) and
> found that most of them make no sense in terms of cluster
> functions/operations and cannot be explained to users.
>
> I propose to deprecate most of them and leave only the timeouts we can
> explain in common terms (setFailureDetectionTimeout, setNetworkTimeout,
> setJoinTimeout and some others).
>
> Please let me know your thoughts.
>
> Thanks,
> Alexey
>
> GLOBAL:
>
> IgniteConfiguration.setNetworkTimeout:
> It is a global timeout for high-level operations that involve the
> network. For instance, IgniteMessaging delivery and the DiscoverySpi
> handshake use this timeout.
>
> IgniteConfiguration.setFailureDetectionTimeout:
> It is a global timeout for detecting failures in IgniteSpi implementations
> (including DiscoverySpi and CommunicationSpi).
> The failure detection algorithm bounds the set of simple network
> operations that make up a single logical operation (for instance, the
> reliable delivery of a DiscoverySpi message within a cluster).
> The failure detection timeout is a cumulative timeout covering the socket
> connection, sending and receiving data bytes, and all possible socket
> retries (if some failure happens).
> This timeout is intended to simplify the failure detection condition from a
> user perspective.
>
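
To illustrate what "cumulative timeout" means above, here is a small plain-Java sketch. It is a model, not Ignite's implementation: the TimeoutBudget class and its injected clock are hypothetical, introduced only for illustration. The point is that connect, write, read and every retry draw from one shared deadline instead of each getting its own full timeout.

```java
import java.util.function.LongSupplier;

/** Hypothetical model of one cumulative timeout shared by all steps of a logical operation. */
final class TimeoutBudget {
    private final long deadlineMs;
    private final LongSupplier clockMs;

    /** The clock is injected so the sketch can be exercised without sleeping. */
    TimeoutBudget(long totalTimeoutMs, LongSupplier clockMs) {
        this.clockMs = clockMs;
        this.deadlineMs = clockMs.getAsLong() + totalTimeoutMs;
    }

    /** Time left for the next connect, write, read or retry; throws once the budget is spent. */
    long remainingMs() {
        long left = deadlineMs - clockMs.getAsLong();
        if (left <= 0)
            throw new IllegalStateException("failure detection budget exhausted");
        return left;
    }

    public static void main(String[] args) {
        long[] now = {0};
        TimeoutBudget budget = new TimeoutBudget(10_000, () -> now[0]); // illustrative 10 s budget

        System.out.println(budget.remainingMs()); // the first step may use the whole budget

        now[0] += 4_000; // pretend connect plus a failed write took 4 s
        System.out.println(budget.remainingMs()); // the retry only gets what is left
    }
}
```

From the user's perspective there is then a single number to reason about: the operation either completes within the budget or the remote node is treated as failed.
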
> IgniteConfiguration.setClientFailureDetectionTimeout:
> It is a special case of the failure detection timeout for DiscoverySpi
> client nodes.
>
> TCP DISCOVERY SPI:
>
> If you need more control over the failure detection algorithm for
> TcpDiscoverySpi, you can explicitly use the following low-level options
> (which disable the failureDetectionTimeout logic):
>
> 1. TcpDiscoverySpi.setConnectTimeout - socket connection timeout
> 2. TcpDiscoverySpi.setReconnectCount - number of reconnect attempts used
> when establishing a connection with the remote node and sending messages
> to it
> 3. TcpDiscoverySpi.setSocketTimeout - socket write timeout. The write
> operation will be repeated getReconnectCount() times if it exceeds this
> timeout
> 4. TcpDiscoverySpi.setAckTimeout - message acknowledgment timeout. If a
> message acknowledgment is not received within this timeout, the send is
> considered failed and the SPI retries it. The timeout is automatically
> doubled on successive retries, up to the getMaxAckTimeout value.
> 5. TcpDiscoverySpi.setMaxAckTimeout - maximum acknowledgment timeout. Once
> getAckTimeout reaches getMaxAckTimeout, the SPI gives up retrying.
>
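
The ack timeout doubling in items 4 and 5 can be sketched in plain Java as follows. This is a model of the description above, not Ignite code: AckBackoff and schedule are hypothetical names, the values are illustrative, and the exact stop rule (no attempt is made once doubling would exceed the maximum) is my assumption about what "reaches getMaxAckTimeout" means.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of the ack timeout doubling described above; not Ignite code. */
final class AckBackoff {
    private AckBackoff() {}

    /**
     * Lists the ack timeout used by each send attempt: start at ackTimeoutMs and
     * double until the next value would exceed maxAckTimeoutMs (assumed stop rule).
     */
    static List<Long> schedule(long ackTimeoutMs, long maxAckTimeoutMs) {
        if (ackTimeoutMs <= 0 || maxAckTimeoutMs < ackTimeoutMs)
            throw new IllegalArgumentException("need 0 < ackTimeout <= maxAckTimeout");

        List<Long> timeouts = new ArrayList<>();
        long t = ackTimeoutMs;
        while (true) {
            timeouts.add(t);
            // Stop when doubling would pass the maximum; comparing against max / 2
            // also avoids overflow for very large values.
            if (t > maxAckTimeoutMs / 2)
                break;
            t *= 2;
        }
        return timeouts;
    }

    public static void main(String[] args) {
        // Illustrative values only, not a statement about Ignite defaults.
        System.out.println(schedule(5_000, 30_000));
    }
}
```

Summing the returned list gives a rough upper bound on how long a single message can wait for acknowledgments before the SPI gives up, which is the kind of derived number users currently have to work out by hand.
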
> Other important TcpDiscoverySpi timeouts:
>
> TcpDiscoverySpi.setJoinTimeout - the timeout for the join process when a
> new/restarted node joins a cluster. The node tries to connect to all
> available IP addresses provided by ipFinder within this timeout.
> If the timeout is exceeded, the node will give up and throw an exception
> from Ignition.start().
>
> TcpDiscoverySpi.setNetworkTimeout - timeout for high-level operations like
> handshake. It looks like it should be deprecated and the
> IgniteConfiguration.getNetworkTimeout should be used here.
>
> TCP COMMUNICATION SPI:
>
> If you need more control over the failure detection algorithm for
> TcpCommunicationSpi, you can explicitly use the following low-level options
> (which disable the failureDetectionTimeout logic):
>
> 1. TcpCommunicationSpi.setConnectTimeout - socket connection timeout. It is
> automatically doubled on successive retries (up to getReconnectCount)
> related to a single logical operation
> 2. TcpCommunicationSpi.setMaxConnectTimeout - maximum connection timeout,
> the upper limit for getConnectTimeout as it is doubled up to
> getReconnectCount times
> 3. TcpCommunicationSpi.setReconnectCount - number of reconnect attempts
> used when establishing a connection with the remote node and sending
> messages to it
>
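
To show how the three communication settings above interact, here is a rough plain-Java sketch of the worst-case time spent connecting. It is not Ignite code: ConnectBackoff and worstCaseMs are hypothetical names, the numbers are illustrative, and the model (one attempt per reconnect count, timeout doubled after each attempt and capped at the maximum) is my reading of the description above.

```java
/** Hypothetical model of the connect timeout doubling described above; not Ignite code. */
final class ConnectBackoff {
    private ConnectBackoff() {}

    /**
     * Sums the per-attempt connect timeouts: the first attempt uses connectTimeoutMs,
     * each later attempt doubles it, capped at maxConnectTimeoutMs, for
     * reconnectCount attempts in total (assumed model).
     */
    static long worstCaseMs(long connectTimeoutMs, long maxConnectTimeoutMs, int reconnectCount) {
        if (connectTimeoutMs <= 0 || maxConnectTimeoutMs < connectTimeoutMs || reconnectCount < 1)
            throw new IllegalArgumentException("need 0 < connectTimeout <= maxConnectTimeout and reconnectCount >= 1");

        long total = 0;
        long t = connectTimeoutMs;
        for (int i = 0; i < reconnectCount; i++) {
            total += t;
            // Double for the next attempt, but never go past the cap
            // (comparing against max / 2 avoids overflow).
            t = (t > maxConnectTimeoutMs / 2) ? maxConnectTimeoutMs : t * 2;
        }
        return total;
    }

    public static void main(String[] args) {
        // Illustrative values only: attempts wait 5 s, 10 s, 20 s, 20 s.
        System.out.println(worstCaseMs(5_000, 20_000, 4));
    }
}
```

The effective failure detection time here is a function of three separate properties, which is exactly why a single failureDetectionTimeout is easier to explain.
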
> Other important TcpCommunicationSpi timeouts:
>
> TcpCommunicationSpi.setSocketWriteTimeout - timeout to send a message.
> TcpCommunicationSpi.setIdleConnectionTimeout - maximum idle connection
> timeout, after which the connection is closed.
>
