Re: Unable to create cluster of Apache Ignite Server Containers running on individual VMs

Piali Mazumder Nath Thu, 01 Dec 2016 14:34:19 -0800

Hi Val,



Thanks for pointing out the errors.

Since I was getting errors related to GC pauses, I had set offHeapMaxMemory
to 0 as per the link below.


http://apacheignite.gridgain.org/docs/performance-tips#tune-off-heap-memory

I have now set it to -1.



After making the required changes and exposing the ports when creating the
container, I no longer see those errors. I am also able to ping the Ignite
ports.

However, the servers are still not joining. I am getting the errors below on
the VM1 container, which is started first. The other node joins initially, but
it is then removed from the cluster again.

[19:01:19,827][INFO][disco-event-worker-#44%null%][GridDiscoveryManager] Added
new node to topology: TcpDiscoveryNode
[id=13e5b190-fa8e-4c8c-be62-9f09ae6fadb9, addrs=[0:0:0:0:0:0:0:1%lo,
127.0.0.1, 172.17.0.1, 172.18.0.1, 172.20.29.33], sockAddrs=[/
172.20.29.33:47500, /172.17.0.1:47500, /0:0:0:0:0:0:0:1%lo:47500, /
127.0.0.1:47500, /172.18.0.1:47500], discPort=47500, order=14, intOrder=8,
lastExchangeTime=1480618879814, loc=false,
ver=1.7.0#20160801-sha1:383273e3, isClient=false]

[19:01:19,828][INFO][disco-event-worker-#44%null%][GridDiscoveryManager]
Topology snapshot [ver=15, servers=2, clients=0, CPUs=4, heap=2.0GB]

[19:01:19,829][WARNING][disco-event-worker-#44%null%][GridDiscoveryManager]
Node FAILED: TcpDiscoveryNode [id=13e5b190-fa8e-4c8c-be62-9f09ae6fadb9,
addrs=[0:0:0:0:0:0:0:1%lo, 127.0.0.1, 172.17.0.1, 172.18.0.1,
172.20.29.33], sockAddrs=[/172.20.29.33:47500, /172.17.0.1:47500,
/0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500, /172.18.0.1:47500],
discPort=47500, order=14, intOrder=8, lastExchangeTime=1480618879814,
loc=false, ver=1.7.0#20160801-sha1:383273e3, isClient=false]

[19:01:19,829][INFO][disco-event-worker-#44%null%][GridDiscoveryManager]
Topology snapshot [ver=15, servers=1, clients=0, CPUs=2, heap=1.0GB]

[19:01:19,846][INFO][exchange-worker-#47%null%][GridCachePartitionExchangeManager]
Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=14, minorTopVer=0], evt=NODE_JOINED,
node=13e5b190-fa8e-4c8c-be62-9f09ae6fadb9]

[19:01:19,865][INFO][exchange-worker-#47%null%][GridCachePartitionExchangeManager]
Skipping rebalancing (nothing scheduled) [top=AffinityTopologyVersion
[topVer=15, minorTopVer=0], evt=NODE_FAILED,
node=13e5b190-fa8e-4c8c-be62-9f09ae6fadb9]



I am getting the error below on the VM2 container. Although VM1 logs a message
that this container has joined, there is no corresponding join message in its
own log.

[18:59:27,919][WARNING][main][TcpDiscoverySpi] Node has not been connected
to topology and will repeat join process. Check remote nodes logs for
possible error messages. Note that large topology may require significant
time to start. Increase 'TcpDiscoverySpi.networkTimeout' configuration
property if getting this message on the starting nodes
[networkTimeout=20000]





Please find my current configuration below. (Since the issue persisted, I
have also included the non-loopback local IPs.)

I have retained the loopback address; otherwise the node does not come up.

    <bean id="grid.cfg" class="org.apache.ignite.configuration.IgniteConfiguration">
      <property name="cacheConfiguration">
        <bean class="org.apache.ignite.configuration.CacheConfiguration">
          <property name="offHeapMaxMemory" value="-1"/>
        </bean>
      </property>

      <!-- Explicitly configure TCP discovery SPI to provide list of initial nodes. -->
      <property name="discoverySpi">
        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
          <property name="localPort" value="47500"/>
          <property name="networkTimeout" value="20000"/>
          <property name="ipFinder">
            <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
              <property name="addresses">
                <list>
                  <!-- In distributed environment, replace with actual host IP address. -->
                  <value>127.0.0.1:47500..47509</value>
                  <value>172.26.116.67:47500..47509</value>
                  <value>172.18.0.1:47500..47509</value>
                  <value>172.17.0.1:47500..47509</value>
                  <value>172.20.29.33:47500..47509</value>
                </list>
              </property>
            </bean>
          </property>
          <property name="ackTimeout" value="50"/>
          <property name="socketTimeout" value="200"/>
          <property name="heartbeatFrequency" value="100"/>
        </bean>
      </property>

      <property name="communicationSpi">
        <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
          <!-- Override local port. -->
          <property name="localPort" value="47100"/>
          <property name="sharedMemoryPort" value="-1"/>
        </bean>
      </property>
    </bean>
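
For comparison, here is an untested sketch of how the discovery section might look with the loopback and Docker bridge addresses removed, as suggested in the quoted reply. It assumes that 172.26.116.67 and 172.20.29.33 are the routable addresses of the two VMs, and that the container can bind to its VM's address (for example, with host networking). The use of the `localAddress` property here is an assumption about what would keep the node from falling back to loopback, not something confirmed in this thread.

```xml
<!-- Sketch only: discovery section for the VM assumed to own 172.26.116.67. -->
<property name="discoverySpi">
  <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
    <!-- Assumption: bind discovery to the routable VM address, not loopback. -->
    <property name="localAddress" value="172.26.116.67"/>
    <property name="localPort" value="47500"/>
    <property name="ipFinder">
      <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
        <property name="addresses">
          <list>
            <!-- Only the addresses the other nodes can actually reach. -->
            <value>172.26.116.67:47500..47509</value>
            <value>172.20.29.33:47500..47509</value>
          </list>
        </property>
      </bean>
    </property>
  </bean>
</property>
```

On the second VM, the `localAddress` value would be that VM's own address instead.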



Can you please help?

Do we need to use an address resolver?
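
To make the question concrete, this is a rough sketch of what an address resolver entry could look like inside the `IgniteConfiguration` bean, using `BasicAddressResolver` with a map from the address the node binds to inside the container to the address other nodes should use. The container-internal address 172.17.0.2 is purely hypothetical, and 172.20.29.33 is assumed to be the VM's externally reachable address. Whether this is needed at all depends on how the container network is set up.

```xml
<!-- Sketch only: goes inside the IgniteConfiguration bean. -->
<property name="addressResolver">
  <bean class="org.apache.ignite.configuration.BasicAddressResolver">
    <constructor-arg>
      <map>
        <!-- key: hypothetical container-internal address;
             value: assumed externally reachable VM address. -->
        <entry key="172.17.0.2" value="172.20.29.33"/>
      </map>
    </constructor-arg>
  </bean>
</property>
```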


On Wed, Nov 30, 2016 at 5:12 PM, vkulichenko <valentin.kuliche...@gmail.com>
wrote:

> Hi,
>
> This configuration is incorrect:
>
> <value>127.0.0.1:47100..47509</value>
> <value>172.26.116.67:47100..47509</value>
>
> First of all, since you're building a distributed cluster, you should not
> use loopback here. Put real addresses that discovery component binds to.
> Second of all, 47100 is a port for communication, not for discovery, so
> ranges should be 47500..47509 instead.
>
> In addition, please check that you not only can ping containers from each
> other, but that you can telnet to Ignite ports after first node is started.
> Sometimes it happens with Docker, that you can explicitly open ports.
>
> And finally, setting offHeapMaxMemory to zero actually enables and makes
> off-heap storage unlimited. To disable it should be set to -1, but this is
> actually a default value, so you don't need to do this either.
>
> -Val



-- 


*Thanks & Regards,*
*Piali Mazumder Nath*
*+1 415 629 7019*
