Re: Unable to create cluster of Apache Ignite Server Containers running on individual VMs
Hi,

You can take a look at the JavaDoc of BasicAddressResolver [1]. This is the implementation provided out of the box.

[1] https://github.com/apache/ignite/blob/master/modules/core/src/main/java/org/apache/ignite/configuration/BasicAddressResolver.java

-Val

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Unable-to-create-cluster-of-Apache-Ignite-Server-Containers-running-on-individual-VMs-tp9287p9377.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.
Re: Unable to create cluster of Apache Ignite Server Containers running on individual VMs
Hi Val,

Is there any example code showing how to use the address resolver in the XML configuration file? I tried searching but couldn't find any. Can you please help?

On Thu, Dec 1, 2016 at 5:09 PM, vkulichenko wrote:
> Hi,
>
> Address resolver is needed if there are private and public addresses on the
> node and it can be accessed only through the public one. I'm not sure if
> that's your case, but it really looks like some ports still can't be
> accessed for some reason. Note that there is not only the discovery port (47500
> by default), but also the communication port (47100 by default). Both of them
> should be accessible from the outside on both nodes.
>
> -Val
>
> --
> View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Unable-to-create-cluster-of-Apache-Ignite-Server-Containers-running-on-individual-VMs-tp9287p9351.html

--
Thanks & Regards,
Piali Mazumder Nath
Re: Unable to create cluster of Apache Ignite Server Containers running on individual VMs
Hi Val,

Thanks for pointing out the errors. Since I was getting errors related to GC pauses, I had set offHeapMaxMemory to 0 as per the link below.

http://apacheignite.gridgain.org/docs/performance-tips#tune-off-heap-memory

But now I have set it to -1.

After making the required changes and exposing the ports while creating the container, I no longer see those errors. I am also able to ping the Ignite ports. However, the servers are still not joining.

I get the errors below on the VM1 container, which is started first. Initially the other node joins, but then it gets removed from the cluster again.

[19:01:19,827][INFO][disco-event-worker-#44%null%][GridDiscoveryManager] Added new node to topology: TcpDiscoveryNode [id=13e5b190-fa8e-4c8c-be62-9f09ae6fadb9, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.17.0.1, 172.18.0.1, 172.20.29.33], sockAddrs=[/172.20.29.33:47500, /172.17.0.1:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, /172.18.0.1:47500], discPort=47500, order=14, intOrder=8, lastExchangeTime=1480618879814, loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false]
[19:01:19,828][INFO][disco-event-worker-#44%null%][GridDiscoveryManager] Topology snapshot [ver=15, servers=2, clients=0, CPUs=4, heap=2.0GB]
[19:01:19,829][WARNING][disco-event-worker-#44%null%][GridDiscoveryManager] Node FAILED: TcpDiscoveryNode [id=13e5b190-fa8e-4c8c-be62-9f09ae6fadb9, addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.17.0.1, 172.18.0.1, 172.20.29.33], sockAddrs=[/172.20.29.33:47500, /172.17.0.1:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, /172.18.0.1:47500], discPort=47500, order=14, intOrder=8, lastExchangeTime=1480618879814, loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false]
[19:01:19,829][INFO][disco-event-worker-#44%null%][GridDiscoveryManager] Topology snapshot [ver=15, servers=1, clients=0, CPUs=2, heap=1.0GB]
[19:01:19,846][INFO][exchange-worker-#47%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=14, minorTopVer=0], evt=NODE_JOINED, node=13e5b190-fa8e-4c8c-be62-9f09ae6fadb9]
[19:01:19,865][INFO][exchange-worker-#47%null%][GridCachePartitionExchangeManager] Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion [topVer=15, minorTopVer=0], evt=NODE_FAILED, node=13e5b190-fa8e-4c8c-be62-9f09ae6fadb9]

I get the error below on the VM2 container. Although VM1 logs that this container has joined, there is no join message in its own log.

[18:59:27,919][WARNING][main][TcpDiscoverySpi] Node has not been connected to topology and will repeat join process. Check remote nodes logs for possible error messages. Note that large topology may require significant time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration property if getting this message on the starting nodes [networkTimeout=2]

Please find my current configuration below. Since the issue persisted, I have also included the non-loopback local IPs. I have retained the loopback address, because otherwise the node does not come up.

127.0.0.1:47500..47509
172.26.116.67:47500..47509
172.18.0.1:47500..47509
172.17.0.1:47500..47509
172.20.29.33:47500..47509

Can you please help? Do we need to use an address resolver?

On Wed, Nov 30, 2016 at 5:12 PM, vkulichenko wrote:
> Hi,
>
> This configuration is incorrect:
>
> 127.0.0.1:47100..47509
> 172.26.116.67:47100..47509
>
> First of all, since you're building a distributed cluster, you should not
> use loopback here. Put real addresses that the discovery component binds to.
> Second of all, 47100 is a port for communication, not for discovery, so
> ranges should be 47500..47509 instead.
>
> In addition, please check that you not only can ping containers from each
> other, but that you can telnet to Ignite ports after the first node is started.
> Sometimes with Docker you have to open ports explicitly.
>
> And finally, setting offHeapMaxMemory to zero actually enables off-heap
> storage and makes it unlimited. To disable it, it should be set to -1, but this is
> actually the default value, so you don't need to do this either.
>
> -Val
Re: Unable to create cluster of Apache Ignite Server Containers running on individual VMs
Hi,

This configuration is incorrect:

127.0.0.1:47100..47509
172.26.116.67:47100..47509

First of all, since you're building a distributed cluster, you should not use loopback here. Put real addresses that the discovery component binds to. Second of all, 47100 is a port for communication, not for discovery, so ranges should be 47500..47509 instead.

In addition, please check that you not only can ping containers from each other, but that you can telnet to Ignite ports after the first node is started. Sometimes with Docker you have to open ports explicitly.

And finally, setting offHeapMaxMemory to zero actually enables off-heap storage and makes it unlimited. To disable it, it should be set to -1, but this is actually the default value, so you don't need to do this either.

-Val

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Unable-to-create-cluster-of-Apache-Ignite-Server-Containers-running-on-individual-VMs-tp9287p9314.html
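
The telnet check suggested above can also be scripted. Below is a minimal sketch in plain Java, not part of Ignite: the class `PortProbe` and its methods are hypothetical helpers written for this illustration. It expands entries written in the `host:port1..port2` form used in this thread and tries a TCP connection to each address. The IP addresses in `main` are the ones mentioned in the thread and are only placeholders.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not an Ignite class): expands address entries and probes them over TCP. */
class PortProbe {
    /** Expands "host:47500..47509" (or "host:47500") into one unresolved socket address per port. */
    static List<InetSocketAddress> expand(String entry) {
        int colon = entry.lastIndexOf(':');
        if (colon < 0)
            throw new IllegalArgumentException("Expected host:port or host:port1..port2: " + entry);

        String host = entry.substring(0, colon);
        String ports = entry.substring(colon + 1);
        int dots = ports.indexOf("..");

        int first = Integer.parseInt(dots < 0 ? ports : ports.substring(0, dots));
        int last = dots < 0 ? first : Integer.parseInt(ports.substring(dots + 2));

        List<InetSocketAddress> res = new ArrayList<>();
        for (int p = first; p <= last; p++)
            res.add(InetSocketAddress.createUnresolved(host, p));
        return res;
    }

    /** Returns true if a TCP connection to host:port succeeds within timeoutMs, similar to a telnet check. */
    static boolean isReachable(String host, int port, int timeoutMs) {
        try (Socket s = new Socket()) {
            s.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder addresses taken from the thread; replace them with the addresses of your own nodes.
        // 47500..47509 is the discovery range, 47100 the default communication port.
        String[] entries = {
            "172.26.116.67:47500..47509", "172.26.116.150:47500..47509",
            "172.26.116.67:47100", "172.26.116.150:47100"
        };
        for (String e : entries)
            for (InetSocketAddress a : expand(e))
                System.out.println(a.getHostString() + ":" + a.getPort()
                    + " reachable=" + isReachable(a.getHostString(), a.getPort(), 1000));
    }
}
```

Run it from each VM (and from inside each container) after the first node has started. With one node per host, only the port the node actually bound to (47500 and 47100 in the logs in this thread) is expected to be reachable; the remaining ports in the range will normally be reported as unreachable.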
Re: Unable to create cluster of Apache Ignite Server Containers running on individual VMs
Hi Pawantlor,

Thanks for your reply. I did my testing using [...]. Since it didn't work, I tried using cloud.TcpDiscoveryVmIpFinder. In both cases I am getting the same error. I forgot to correct it before posting.

Are your containers running on virtual machines?

On Wed, Nov 30, 2016 at 5:44 AM, pawantlor wrote:
> I feel you should use "vm.TcpDiscoveryVmIpFinder" instead of cloud.
> What is the error after you use "vm.TcpDiscoveryVmIpFinder"?
>
> I am using a container-based setup with ZooKeeper-based discovery and it's
> working fine.

--
Thanks & Regards,
Piali Mazumder Nath
Re: Unable to create cluster of Apache Ignite Server Containers running on individual VMs
I feel you should use "vm.TcpDiscoveryVmIpFinder" instead of cloud. What is the error after you use "vm.TcpDiscoveryVmIpFinder"?

I am using a container-based setup with ZooKeeper-based discovery and it's working fine.

--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Unable-to-create-cluster-of-Apache-Ignite-Server-Containers-running-on-individual-VMs-tp9287p9296.html
Unable to create cluster of Apache Ignite Server Containers running on individual VMs
Hi,

I am trying to create a cluster of Apache Ignite server containers but am unable to bring it up.

*Setup:*
To start, I have created two VMs on two separate host machines and am trying to launch one Apache Ignite server container (Docker) on each VM. The VMs are accessible using floating IPs (e.g., VM1-172.26.116.67, VM2-172.26.116.150) and the containers are using host networking. The containers can also ping each other.

*Testing:*
I am using $IGNITE_HOME/bin/ignite.sh, but have changed the default configuration to enable discovery.

127.0.0.1:47100..47509
172.26.116.67:47100..47509

*Issue:*
When I start the first Apache Ignite server container on VM1, I see warnings related to remote node GC pauses even though I tuned off-heap memory, and even though no remote node is running (verified through Visor).

[22:55:24,132][INFO][main][TcpCommunicationSpi] Successfully bound to TCP port [port=47100, locHost=0.0.0.0/0.0.0.0]
[22:55:24,854][INFO][main][TcpDiscoverySpi] Successfully bound to TCP port [port=47500, localHost=0.0.0.0/0.0.0.0]
[22:55:26,379][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=50]
[22:55:26,482][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=100]
[22:55:26,684][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=200]
[22:55:27,086][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=400]
[22:55:27,888][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=800]
[22:55:29,491][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=1600]
[22:55:32,696][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=3200]
[22:55:39,098][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=6400]
[22:55:51,904][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=12800]
[22:56:17,523][WARNING][main][TcpDiscoverySpi] Timed out waiting for message to be read (most probably, the reason is in long GC pauses on remote node) [curTimeout=25600]
[22:56:18,033][WARNING][main][GridCacheProcessor] Eviction policy not enabled with ONHEAP_TIERED mode for cache (entries will not be moved to off-heap store): default
[22:56:18,120][SEVERE][grid-nio-worker-1-#38%null%][GridDirectParser] Failed to read message [msg=null, buf=java.nio.DirectByteBuffer[pos=5 lim=420 cap=32768], reader=null, ses=GridSelectorNioSessionImpl [selectorIdx=1, queueSize=1, writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=5 lim=420 cap=32768], recovery=null, super=GridNioSessionImpl [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:48819, createTime=1480460126370, closeTime=0, bytesSent=0, bytesRcvd=420, sndSchedTime=1480460178019, lastSndTime=1480460178114, lastRcvTime=1480460178114, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser@330f5ec2, directMode=true], GridConnectionBytesVerifyFilter], accepted=true]]]
…
[22:56:18,127][WARNING][grid-nio-worker-1-#38%null%][TcpCommunicationSpi] Failed to process selector key (will close): GridSelectorNioSessionImpl [selectorIdx=1, queueSize=1, writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=5 lim=420 cap=32768], recovery=null, super=GridNioSessionImpl [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:48819, createTime=1480460126370, closeTime=0, bytesSent=0, bytesRcvd=420, sndSchedTime=1480460178019, lastSndTime=1480460178114, lastRcvTime=1480460178114, readsPaused=false,