[
https://issues.apache.org/jira/browse/GEODE-6536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mario Ivanac updated GEODE-6536:
--------------------------------
Fix Version/s: (was: 1.13.0)
> client pr single hop will do multiple hops if things get busy
> -------------------------------------------------------------
>
> Key: GEODE-6536
> URL: https://issues.apache.org/jira/browse/GEODE-6536
> Project: Geode
> Issue Type: Improvement
> Components: client/server
> Reporter: Darrel Schneider
> Assignee: Mario Ivanac
> Priority: Major
> Labels: needs-review, performance, pull-request-available
> Time Spent: 1h 50m
> Remaining Estimate: 0h
>
> I was wondering why ConnectionManagerImpl.borrowConnection that targets a
> particular server will not wait for a connection to be available. This is the
> borrow used by pr single hop. It seems like if all the cnxs to that server
> are busy and we are at max that it should wait. But instead it does a single
> scan of the pool and if it does not find an idle connection it either throws
> AllConnectionsInUse or it creates one. The caller sets the onlyUseExistingCnx
> parameter based on if the pool is at max cnxs or not (seems like this should
> have been done in borrowConnection instead of the caller). If it throws
> AllConnectionsInUse the caller catches it and ignores it and then falls into
> doing a non-targeted borrow that will wait for a connection but to ANY
> server. So our single hop optimization goes away on a busy client. Since it
> can’t immediately get a connection to server A, it says just give me a
> connection to any server sends the op to that other server and that server
> ends up having to forward it to server A. You would only see this problem if
> you set a max on your client connection pool. The default of -1 just says to
> always create another connection. The only workaround I know of is to not
> limit how many connections your client can create but that could cause other
> problem (i.e. too many file descriptors).
> The questionable code is in the following classes:
> org.apache.geode.cache.client.internal.GetOp
> org.apache.geode.cache.client.internal.PutOp
> org.apache.geode.cache.client.internal.DestroyOp
> The all have something like this:
>
> {code:java}
> boolean onlyUseExistingCnx = ((poolImpl.getMaxConnections() != -1
> && poolImpl.getConnectionCount() >= poolImpl.getMaxConnections()) ? true :
> false);
> op.setAllowDuplicateMetadataRefresh(!onlyUseExistingCnx);
> return pool.executeOn(new ServerLocation(server.getHostName(),
> server.getPort()), op,
> true, onlyUseExistingCnx);
> {code}
> executeOn eventually calls ConnectionManagerImpl.borrowConnection
> It is unclear to me why they need allow duplicate metadata refresh if they
> permit borrowConnection to create a new connection to the specified server.
> They all catch and ignore
> AllConnectionsInUseException falling through and doing non-targetted
> pool.execute.
> The concern might have been that the pool would be full of connections to
> other servers preventing us from connecting to the server we need to
> communicate with. If we have existing connections to the server we could wait
> for one of those. If we don't and we are at max then we should consider
> closing a connection to some other server and create a connection to the
> server we know we need for this operation. As the server count goes up and
> the max connection count on the pool goes down I'm sure we are probably
> better off not even trying to do single hop since we might not even be able
> to have a connection to each server in the pool. But if we have a reasonable
> max connection count then it seems like we could make more of an effort to
> make sure a per server connection count is balanced and then wait for one of
> those connection to be available.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)