[jira] [Commented] (GEODE-746) When starting a locator using --bind-address, gfsh prints incorrect connect message
[ https://issues.apache.org/jira/browse/GEODE-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15533405#comment-15533405 ]

Jared Stewart commented on GEODE-746:
-------------------------------------

This issue was reopened for a problem unrelated to {{--bind-address}}; that problem already has its own ticket: [GEODE-1548 | https://issues.apache.org/jira/browse/GEODE-1548]. I am closing this issue and reopening that one.

> When starting a locator using --bind-address, gfsh prints incorrect connect message
> -----------------------------------------------------------------------------------
>
>                 Key: GEODE-746
>                 URL: https://issues.apache.org/jira/browse/GEODE-746
>             Project: Geode
>          Issue Type: Improvement
>          Components: gfsh
>            Reporter: Jens Deppe
>            Assignee: Kevin Duling
>             Fix For: 1.0.0-incubating.M3
>
>
> When starting my locator with {{gfsh start locator --name=locator1 --port=19991 --bind-address=192.168.103.1}}, the output from gfsh looks like this:
> {noformat}
> ..
> Locator in /Users/jdeppe/debug/locator1 on 192.168.103.1[19991] as locator1 is currently online.
> Process ID: 2666
> Uptime: 15 seconds
> GemFire Version: 8.2.0.Beta
> Java Version: 1.7.0_72
> Log File: /Users/jdeppe/debug/locator1/locator1.log
> JVM Arguments: -Dgemfire.enable-cluster-configuration=true -Dgemfire.load-cluster-configuration-from-dir=false -Dgemfire.launcher.registerSignalHandlers=true -Djava.awt.headless=true -Dsun.rmi.dgc.server.gcInterval=9223372036854775806
> Class-Path: /Users/jdeppe/gemfire/82/lib/gemfire.jar:/Users/jdeppe/gemfire/82/lib/locator-dependencies.jar
> Please use "connect --locator=192.168.1.10[19991]" to connect Gfsh to the locator.
> Failed to connect; unknown cause: Connection refused
> {noformat}
> The connect string shown is just displaying my host address and not the bind address.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (GEODE-746) When starting a locator using --bind-address, gfsh prints incorrect connect message
[ https://issues.apache.org/jira/browse/GEODE-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15531158#comment-15531158 ]

Jared Stewart commented on GEODE-746:
-------------------------------------

To reproduce this bug, start a locator on the AWS instance:
{code}
ssh -i /Users/jstewart/Downloads/management-and-monitoring.cer ubuntu@52.89.135.89
gfsh>start locator --name=l1
{code}
Now, on your local machine:
{code}
gfsh>connect --locator=52.89.135.89[10334]
Connecting to Locator at [host=52.89.135.89, port=10334] ..
Connecting to Manager at [host=172.31.19.71, port=1099] ..
Could not connect to : [host=172.31.19.71, port=1099]. Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 172.31.19.71; nested exception is: java.net.ConnectException: Operation timed out]
{code}
Swap thinks the following command ought to solve this problem, but at the moment it simply hangs when run on the AWS instance:
{code}
gfsh>start locator --name=l1 --J=-Djava.rmi.server.hostname=52.89.135.89
{code}
[jira] [Commented] (GEODE-746) When starting a locator using --bind-address, gfsh prints incorrect connect message
[ https://issues.apache.org/jira/browse/GEODE-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383104#comment-15383104 ]

ASF subversion and git services commented on GEODE-746:
-------------------------------------------------------

Commit 3473229fc3fb1337fb3f85d419f8bddb04a2e9b3 in incubator-geode's branch refs/heads/develop from [~kduling]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-geode.git;h=3473229 ]

GEODE-746: When starting a locator using --bind-address, gfsh prints incorrect connect message

* This closes #208
[jira] [Commented] (GEODE-746) When starting a locator using --bind-address, gfsh prints incorrect connect message
[ https://issues.apache.org/jira/browse/GEODE-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15383106#comment-15383106 ]

ASF GitHub Bot commented on GEODE-746:
--------------------------------------

Github user asfgit closed the pull request at:

    https://github.com/apache/incubator-geode/pull/208
[jira] [Commented] (GEODE-746) When starting a locator using --bind-address, gfsh prints incorrect connect message
[ https://issues.apache.org/jira/browse/GEODE-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15377423#comment-15377423 ]

Kevin Duling commented on GEODE-746:
------------------------------------

Grace and I tracked the first part of this down to a problem in {{LauncherLifecycleCommands}}:

{{String locatorHostName = StringUtils.defaultIfBlank(locatorLauncher.getHostnameForClients(), getLocalHost());}}

We've changed this to look at the bind address first:
{code}
String locatorHostName;
InetAddress bindAddr = locatorLauncher.getBindAddress();
if (bindAddr != null) {
  locatorHostName = bindAddr.getCanonicalHostName();
} else {
  locatorHostName = StringUtils.defaultIfBlank(locatorLauncher.getHostnameForClients(), getLocalHost());
}
{code}
This improved things a little. gfsh now connects:

{{gfsh start locator --name=locator1 --port=19991 --bind-address=192.168.1.187}}
{noformat}
Listening for transport dt_socket at address: 3
...
Locator in /gemfire/open/locator1 on 192.168.1.187[19991] as locator1 is currently online.
Process ID: 2765
Uptime: 1 minute 23 seconds
GemFire Version: 1.0.0-incubating-SNAPSHOT
Java Version: 1.8.0_92
Log File: /gemfire/open/locator1/locator1.log
JVM Arguments: -Dgemfire.enable-cluster-configuration=true -Dgemfire.load-cluster-configuration-from-dir=false -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=2 -Dgemfire.launcher.registerSignalHandlers=true -Djava.awt.headless=true -Dsun.rmi.dgc.server.gcInterval=9223372036854775806
Class-Path: /gemfire/open/geode-assembly/build/install/apache-geode/lib/geode-core-1.0.0-incubating-SNAPSHOT.jar:/gemfire/open/geode-assembly/build/install/apache-geode/lib/geode-dependencies.jar

Successfully connected to: [host=pdx2-office-dhcp9.eng.vmware.com, port=1099]

Cluster configuration service is up and running.
{noformat}
But now the "Successfully connected" message shows the wrong address. Looking at netstat, we can see that the listener is correctly bound to the IP address specified:
{noformat}
$ netstat -an | grep 19991
tcp4       0      0  192.168.1.187.19991    *.*                    LISTEN
{noformat}
Yet the hostname in the message resolves to a different NIC:

{{ping pdx2-office-dhcp9.eng.vmware.com}}
{noformat}
PING pdx2-office-dhcp9.eng.vmware.com (10.118.33.209): 56 data bytes
{noformat}
Both NICs exist on this machine; the wrong one is being reported:

{{netstat -rn}}
{noformat}
Routing tables

Internet:
Destination        Gateway            Flags    Refs Use    Netif Expire
default            10.118.33.253      UGSc     3600        en4
default            192.168.1.253      UGScI    350         en0
{noformat}
Tracing this down, the cause appears to be an incorrect response from the locator in {{ShellCommands.connectToLocator(String host, int port, int timeout, Map<String, String> props)}}:
{code}
JmxManagerLocatorResponse locatorResponse = JmxManagerLocatorRequest.send(host, port, timeout, props);
// locatorResponse: "JmxManagerLocatorResponse [host=10.118.33.209, port=1099, ssl=false, ex=null]"
// host: "192.168.1.187"
// port: 19991
// timeout: 15000
// props: size = 0
{code}
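The selection order described in Kevin's comment (bind address first, then hostname-for-clients, then the local host) can be sketched in isolation. This is a minimal, hypothetical illustration and not the code in {{LauncherLifecycleCommands}}: the class {{AdvertisedHost}} and the method {{choose}} are invented for the example. It also deliberately returns the literal IP via {{InetAddress.getHostAddress()}} rather than {{getCanonicalHostName()}}, on the assumption that skipping the reverse DNS lookup avoids printing a name that resolves to a different NIC.

```java
import java.net.InetAddress;

public class AdvertisedHost {

  /**
   * Picks the host to show in the "connect --locator=host[port]" hint.
   * Precedence: bind address, then hostname-for-clients, then local host.
   * Hypothetical helper for illustration only.
   */
  public static String choose(InetAddress bindAddress, String hostnameForClients, String localHost) {
    if (bindAddress != null) {
      // Literal IP, no DNS lookup (assumption: safer than a canonical name here).
      return bindAddress.getHostAddress();
    }
    if (hostnameForClients != null && !hostnameForClients.trim().isEmpty()) {
      return hostnameForClients;
    }
    return localHost;
  }

  public static void main(String[] args) throws Exception {
    // getByAddress builds the address from raw bytes without consulting DNS.
    InetAddress bind = InetAddress.getByAddress(new byte[] {(byte) 192, (byte) 168, 1, (byte) 187});
    System.out.println(choose(bind, null, "192.168.1.10"));  // prints 192.168.1.187
    System.out.println(choose(null, "  ", "192.168.1.10"));  // prints 192.168.1.10
  }
}
```

Whether an IP literal or a hostname is the better thing to print depends on the deployment; the sketch only shows that giving the bind address precedence keeps the hint consistent with the socket the locator actually listens on.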
[jira] [Commented] (GEODE-746) When starting a locator using --bind-address, gfsh prints incorrect connect message
[ https://issues.apache.org/jira/browse/GEODE-746?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15110014#comment-15110014 ]

Jens Deppe commented on GEODE-746:
----------------------------------

Actually, there are two things wrong here. The first is simply the incorrect text being displayed. The second is the message stating {{Failed to connect; unknown cause: Connection refused}}. I believe what's happening is that the locator is connecting back to itself to check its status.

The problem becomes more evident when we bind the JMX manager to an address. At that point, gfsh cannot connect to the JMX manager and gets an error like:
{noformat}
Connecting to Locator at [host=localhost, port=19991] ..
Connecting to Manager at [host=localhost, port=1099] ..
Connection refused to host: 192.168.1.28; nested exception is:
        java.net.ConnectException: Connection refused
{noformat}
This can be resolved by setting {{java.rmi.server.hostname=127.0.0.1}} on the locator.

We should consider setting {{java.rmi.server.hostname}} when using {{jmx-manager-bind-address}} (or just {{bind-address}}).
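The suggestion above, setting {{java.rmi.server.hostname}} whenever a bind address is configured, could look roughly like the following. This is only a sketch under stated assumptions: {{RmiHostnameArgs}} and {{withRmiHostname}} are hypothetical names, not Geode code, and the helper does nothing more than add the system property to a list of JVM arguments when the caller has not already supplied one.

```java
import java.util.ArrayList;
import java.util.List;

public class RmiHostnameArgs {

  private static final String PROPERTY = "-Djava.rmi.server.hostname=";

  /**
   * Returns a copy of jvmArgs with java.rmi.server.hostname set to the bind
   * address, unless no bind address is given or the property is already set.
   * Hypothetical helper for illustration only.
   */
  public static List<String> withRmiHostname(List<String> jvmArgs, String bindAddress) {
    List<String> result = new ArrayList<>(jvmArgs);
    if (bindAddress == null || bindAddress.trim().isEmpty()) {
      return result; // no bind address: leave the arguments untouched
    }
    for (String arg : result) {
      if (arg.startsWith(PROPERTY)) {
        return result; // an explicit user setting takes precedence
      }
    }
    result.add(PROPERTY + bindAddress.trim());
    return result;
  }

  public static void main(String[] args) {
    List<String> base = new ArrayList<>();
    base.add("-Djava.awt.headless=true");
    // prints [-Djava.awt.headless=true, -Djava.rmi.server.hostname=192.168.1.187]
    System.out.println(withRmiHostname(base, "192.168.1.187"));
  }
}
```

Leaving an explicit {{--J=-Djava.rmi.server.hostname=...}} untouched is a design choice made for the sketch, so that a user-supplied value is never silently overridden.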