[ 
https://issues.apache.org/jira/browse/GEODE-4180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16308924#comment-16308924
 ] 

Bruce Schuchardt commented on GEODE-4180:
-----------------------------------------

This seems to be a problem with the ClusterStartupRule.  I modified it and 
LocatorStarterRule to list the files in the workingDir from the controller and 
the current working directory in the locator VM and saw the problem.  The first 
two listings of the directory's contents use an absolute path to create a File 
and list its contents.  The last listing uses 'new File(".")' to list the 
contents of the directory:

listing 1 in the ClusterStartupRule that creates the workingDir
{noformat}
[vm0] bjs: creating working dir /var/folders/f6/xzqg10fn7yv5xlpkdpnb3wy80000gq/T/junit2127669778675404869/locator-0
[vm0] bjs: files in working dir for member are (dir=/var/folders/f6/xzqg10fn7yv5xlpkdpnb3wy80000gq/T/junit2127669778675404869/locator-0/.)
{noformat}

listing 2 in ClusterStartupRule after setting user.dir
{noformat}
[vm0] bjs: files in working directory after setting withWorkingDir on locator (dir=/var/folders/f6/xzqg10fn7yv5xlpkdpnb3wy80000gq/T/junit2127669778675404869/locator-0/.)
{noformat}

listing 3 in LocatorStarterRule.before(), listing directory contents with 
'new File(".")'
{noformat}
[vm0] bjs: files in current directory before starting locator (dir=/var/folders/f6/xzqg10fn7yv5xlpkdpnb3wy80000gq/T/junit2127669778675404869/locator-0/.)
[vm0] locator0view.dat
[vm0] locator0views.log
[vm0] locator54799views.log
{noformat}

Then the locator starts up and recovers from the locator0view.dat file:

{noformat}
[vm0] [info 2018/01/02 16:14:03.919 PST <RMI TCP Connection(1)-10.118.20.16> tid=20] Peer locator recovering from /var/folders/f6/xzqg10fn7yv5xlpkdpnb3wy80000gq/T/junit2127669778675404869/locator-0/locator0view.dat

[vm0] [info 2018/01/02 16:14:03.939 PST <RMI TCP Connection(1)-10.118.20.16> tid=20] Peer locator recovered membership is View[10.118.20.16(locator-0:28192:locator)<ec><v0>:32770|-1] members: [10.118.20.16(server-1:28193)<v1>:32771{lead}]
{noformat}

I think that setting the user.dir property isn't enough, because it does not 
change the JVM process's actual working directory.  If you create a 
File(".") you can see what's going wrong.  Its absolute path will report that 
it's using the new user.dir, but if you list its files you will see the 
contents of the old user.dir directory.
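
The mismatch described above can be reproduced outside Geode. The following is 
a minimal sketch, not code from ClusterStartupRule or LocatorStarterRule: the 
class UserDirDemo and its methods are hypothetical names made up for 
illustration. It assumes a typical JVM where a relative path such as "." is 
resolved by the operating system against the directory the process was started 
in. It deliberately does not print getAbsolutePath(), since whether that 
reflects a changed user.dir may depend on the JDK version.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class; not part of Geode.
class UserDirDemo {

  /** Lists the entry names of dir; returns an empty list if it cannot be listed. */
  static List<String> listNames(File dir) {
    String[] names = dir.list();
    return names == null ? Collections.<String>emptyList() : Arrays.asList(names);
  }

  /** Sets user.dir to newDir, lists new File("."), then restores user.dir. */
  static List<String> listDotWithUserDir(Path newDir) {
    String original = System.getProperty("user.dir");
    System.setProperty("user.dir", newDir.toString());
    try {
      // "." is a relative path, so the listing is taken from the process's
      // real working directory, not from the directory named in user.dir.
      return listNames(new File("."));
    } finally {
      System.setProperty("user.dir", original);
    }
  }

  public static void main(String[] args) throws IOException {
    // Stand-in for the per-test working directory and a file inside it.
    Path newDir = Files.createTempDirectory("locator-0");
    Path marker = Files.createTempFile(newDir, "marker", ".dat");
    String name = marker.getFileName().toString();

    System.out.println("relative listing sees marker: "
        + listDotWithUserDir(newDir).contains(name));
    System.out.println("absolute listing sees marker: "
        + listNames(newDir.toFile()).contains(name));
  }
}
```

Under the assumption above, the relative listing should not see the marker 
file while the absolute listing should. That would suggest building every File 
from the working directory's absolute path rather than relying on user.dir.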


> Reference to locator view file and/or its contents are not cleaned up 
> properly during cache close
> -------------------------------------------------------------------------------------------------
>
>                 Key: GEODE-4180
>                 URL: https://issues.apache.org/jira/browse/GEODE-4180
>             Project: Geode
>          Issue Type: Bug
>          Components: tests
>    Affects Versions: 1.4.0
>            Reporter: Kirk Lund
>
> We temporarily set the member-timeout to its max value to allow us to step 
> through some code in the debugger. We noticed that if we ran all 4 tests 
> together, tests 2-3 were hanging. 
> After removing the member-timeout setting, we found that all of the tests 
> after the 1st test are trying to connect to the non-existent locator from 
> the 1st test. This causes all tests after the 1st test to take ~2 seconds 
> longer when run together than when run individually.
> After digging a bit more, I discovered that even though the test is deleting 
> the entire directory containing the locator0view.dat file, some code 
> somewhere must still have an open connection or stream to it because its 
> contents from the 1st test continue to be read for each subsequent test even 
> after the file itself and its directory have been deleted.
> I believe some static code somewhere is keeping a reference to the file 
> and/or its contents. So each test continues to read the same content even 
> though the file no longer exists on disk.
> The following shows the relevant messages logged by the 4 tests in a DUnit 
> test, showing that tests 2-3 find and use the file and/or its contents from 
> test 1. Note that I used the IntelliJ debugger to confirm that this occurs 
> even after test 1 deletes the file and its directory.
> 1) createsRegionMappingOnceOnly
> {noformat}
> [vm0] [info 2017/12/29 10:59:30.826 PST <RMI TCP Connection(1)-192.168.1.18> tid=20] recovery file not found: /var/folders/28/m__9dv1906n60kmz7t71wm680000gn/T/junit543979839291182624/vm-0-createsRegionMappingOnceOnly/locator0view.dat
> [vm0] [info 2017/12/29 10:59:31.135 PST <RMI TCP Connection(1)-192.168.1.18> tid=20] received new view: View[192.168.1.18(58582:locator)<ec><v0>:32770|0] members: [192.168.1.18(58582:locator)<ec><v0>:32770]
> [vm0] old view is: null
> {noformat}
> 2) createsRegionMappingWithMinimumParams
> {noformat}
> [vm0] [info 2017/12/29 10:59:34.580 PST <RMI TCP Connection(1)-192.168.1.18> tid=20] Peer locator recovering from /var/folders/28/m__9dv1906n60kmz7t71wm680000gn/T/junit1076413749574999935/vm-0-createsRegionMappingWithMinimumParams/locator0view.dat
> [vm0] [info 2017/12/29 10:59:34.580 PST <RMI TCP Connection(1)-192.168.1.18> tid=20] Peer locator recovered membership is View[192.168.1.18(58582:locator)<ec><v0>:32770|-1] members: [192.168.1.18(58580)<v1>:32771{lead}]
> {noformat}
> 3) createsRegionMappingInService
> {noformat}
> [vm0] [info 2017/12/29 10:59:40.538 PST <RMI TCP Connection(1)-192.168.1.18> tid=20] Peer locator recovering from /var/folders/28/m__9dv1906n60kmz7t71wm680000gn/T/junit8253504123764665822/vm-0-createsRegionMappingInService/locator0view.dat
> [vm0] [info 2017/12/29 10:59:40.538 PST <RMI TCP Connection(1)-192.168.1.18> tid=20] Peer locator recovered membership is View[192.168.1.18(58582:locator)<ec><v0>:32770|-1] members: [192.168.1.18(58580)<v1>:32771{lead}]
> {noformat}
> 4) recreatesCacheFromClusterConfigWithRegionMapping
> {noformat}
> [vm0] [info 2017/12/29 10:59:46.495 PST <RMI TCP Connection(1)-192.168.1.18> tid=20] Peer locator recovering from /var/folders/28/m__9dv1906n60kmz7t71wm680000gn/T/junit1056719983598139185/vm-0-recreatesCacheFromClusterConfigWithRegionMapping/locator0view.dat
> [vm0] [info 2017/12/29 10:59:46.496 PST <RMI TCP Connection(1)-192.168.1.18> tid=20] Peer locator recovered membership is View[192.168.1.18(58582:locator)<ec><v0>:32770|-1] members: [192.168.1.18(58580)<v1>:32771{lead}]
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
