Thanks a lot for the information, Barry.

Alberto
________________________________
From: Barry Oglesby <bogle...@vmware.com>
Sent: Friday, April 15, 2022 7:42 PM
To: dev@geode.apache.org <dev@geode.apache.org>; u...@geode.apache.org 
<u...@geode.apache.org>
Subject: Re: On conserve-sockets=true with WAN and/or transactions - Follow-up 
on April's Geode Community Meeting

Alberto,

I can only speak to the WAN question in your email. The conserve-sockets 
setting was (or is) a limitation on serial WAN, but I just ran a few tests, and 
they are not deadlocking. It's been a while since I've tried serial WAN with 
conserve-sockets=true, but I'm pretty sure a test with several servers in each 
site and a multi-threaded client doing puts used to cause the deadlock. That is 
not happening in my tests. We would need far more than a few simple tests to 
prove that it doesn't deadlock in other scenarios, though.

Barry
________________________________
From: Alberto Gomez <alberto.go...@est.tech>
Sent: Friday, April 8, 2022 4:17 AM
To: dev@geode.apache.org <dev@geode.apache.org>; u...@geode.apache.org 
<u...@geode.apache.org>
Subject: On conserve-sockets=true with WAN and/or transactions - Follow-up on 
April's Geode Community Meeting

Hi,

Following up on the discussion we had yesterday in the Apache Geode Community 
meeting around the "Reflections on conserve-sockets setting in Apache Geode" 
topic, I'd like to post here some questions that could not be fully answered 
during the meeting:

The Geode documentation states the following about conserve-sockets and WAN 
deployments in [1]:
"WAN deployments increase the messaging demands on a Geode system. To avoid 
hangs related to WAN messaging, always set `conserve-sockets=false` for Geode 
members that participate in a WAN deployment."

It also states the following about conserve-sockets and transactions in [2]:
"When you have transactions operating on EMPTY, NORMAL or PARTITION regions, 
make sure that conserve-sockets is set to false to avoid distributed deadlocks."
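
For reference, both quoted recommendations come down to the same member 
property. Below is a minimal sketch of setting it programmatically, assuming 
the member is configured through a java.util.Properties object. The class and 
method names are hypothetical, and the CacheFactory call is only mentioned in a 
comment because it needs geode-core on the classpath:

```java
import java.util.Properties;

// Hypothetical helper class, for illustration only.
class ConserveSocketsExample {

    // Builds the properties a member would be started with.
    static Properties memberProperties() {
        Properties props = new Properties();
        // "conserve-sockets" is the property name used in the quoted docs.
        props.setProperty("conserve-sockets", "false");
        return props;
    }

    public static void main(String[] args) {
        Properties props = memberProperties();
        // In a real member these properties would be handed to Geode,
        // for example via new CacheFactory(props).create().
        System.out.println("conserve-sockets=" + props.getProperty("conserve-sockets"));
    }
}
```

The same setting can presumably also be placed in the member's 
gemfire.properties file as the line conserve-sockets=false.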

Doing a search on the Geode tests, the only test case related to deadlocks with 
conserve-sockets=true that I have found is:
https://github.com/apache/geode/blob/41eb49989f25607acfcbf9ac5afe3d4c0721bb35/geode-wan/src/distributedTest/java/org/apache/geode/internal/cache/wan/serial/SerialGatewaySenderDistributedDeadlockDUnitTest.java#L176
According to the comments in the test, it always causes a distributed deadlock 
and has therefore been commented out. However, the test case is actually NOT 
commented out and, in fact, if you execute it, it passes without any failure or 
deadlock.

And here are the questions:

Could it be that deadlocks with conserve-sockets=true and WAN and/or 
transactions over partitioned regions were a legacy issue that has already 
been fixed?

Otherwise, could someone please provide some more information about why these 
deadlocks could happen? It would be great if there were test cases that 
showcase this possibility.

It looks like a big limitation of Geode that you are forced to set 
conserve-sockets to false (with the implications this has on resource usage) 
when you are using WAN replication and/or transactions on partitioned regions.

Could it be that there are other elements (for example, the use of 
CacheListeners, as Anthony Baker pointed out) that would increase the risk of 
hitting a distributed deadlock?

Thanks in advance,

Alberto

[1]: https://geode.apache.org/docs/guide/114/managing/monitor_tune/sockets_and_gateways.html

[2]: https://geode.apache.org/docs/guide/114/managing/monitor_tune/performance_controls_controlling_socket_use.html
