Hello!

I can see the following in the thread dump:

"main" #1 prio=5 os_prio=0 tid=0x00007f02c400d800 nid=0x1e43 runnable [0x00007f02cad1e000]
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.Net.poll(Native Method)
        at sun.nio.ch.SocketChannelImpl.poll(SocketChannelImpl.java:951)
        - locked <0x00000000ec066048> (a java.lang.Object)
        at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:121)
        - locked <0x00000000ec066038> (a java.lang.Object)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createTcpClient(TcpCommunicationSpi.java:3299)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.createNioClient(TcpCommunicationSpi.java:2987)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:2870)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:2713)
        at org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2672)
        at org.apache.ignite.internal.managers.communication.GridIoManager.send(GridIoManager.java:1656)
        at org.apache.ignite.internal.managers.communication.GridIoManager.sendToGridTopic(GridIoManager.java:1731)
        at org.apache.ignite.internal.processors.task.GridTaskWorker.sendRequest(GridTaskWorker.java:1436)
        at org.apache.ignite.internal.processors.task.GridTaskWorker.processMappedJobs(GridTaskWorker.java:666)
        at org.apache.ignite.internal.processors.task.GridTaskWorker.body(GridTaskWorker.java:538)
        at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:120)
        at org.apache.ignite.internal.processors.task.GridTaskProcessor.startTask(GridTaskProcessor.java:764)
        at org.apache.ignite.internal.processors.task.GridTaskProcessor.execute(GridTaskProcessor.java:392)
        at org.apache.ignite.internal.IgniteComputeImpl.executeAsync0(IgniteComputeImpl.java:528)
        at org.apache.ignite.internal.IgniteComputeImpl.execute(IgniteComputeImpl.java:498)
        at org.apache.ignite.visor.visor$.execute(visor.scala:1800)

It seems that Visor is trying to connect to the client node via Communication, and it fails because the network connection is filtered out.

Regards,
--
Ilya Kasnacheev


Mon, 29 Jun 2020 at 23:47, John Smith <java.dev....@gmail.com>:

> Ok.
>
> I am able to reproduce the "issue", unless we have a misunderstanding and
> are not talking about the same thing...
>
> My thick client runs inside a container in a closed network, NOT bridged
> and NOT host. I added a flag to my application that allows it to add the
> address resolver to the config.
>
> 1- If I disable address resolution, connect with visor to the cluster,
> and try to print detailed statistics for that particular client, visor
> freezes indefinitely at the Data Region Snapshot.
> Control-C doesn't kill visor either. It is just stuck. This also happens
> when running the cache command. It just freezes indefinitely.
>
> I attached the jstack output to the email but it is also here:
> https://www.dropbox.com/s/wujcee1gd87gk6o/jstack.out?dl=0
>
> 2- If I enable address resolution for the thick client, then all the
> commands work ok. I also see an "Accepted incoming communication
> connection" log in the client.
>
>
> On Mon, 29 Jun 2020 at 15:30, Ilya Kasnacheev <ilya.kasnach...@gmail.com>
> wrote:
>
>> Hello!
>>
>> The easiest way is jstack <process id of visor>
>>
>> Regards,
>> --
>> Ilya Kasnacheev
>>
>>
>> Mon, 29 Jun 2020 at 20:20, John Smith <java.dev....@gmail.com>:
>>
>>> How?
>>>
>>> On Mon, 29 Jun 2020 at 12:03, Ilya Kasnacheev <ilya.kasnach...@gmail.com>
>>> wrote:
>>>
>>>> Hello!
>>>>
>>>> Try collecting a thread dump from Visor as it freezes.
>>>>
>>>> Regards,
>>>> --
>>>> Ilya Kasnacheev
>>>>
>>>>
>>>> Mon, 29 Jun 2020 at 18:11, John Smith <java.dev....@gmail.com>:
>>>>
>>>>> How though?
>>>>>
>>>>> 1- Entered node command
>>>>> 2- Got list of nodes, including thick clients
>>>>> 3- Selected thick client
>>>>> 4- Entered Y for detailed statistics
>>>>> 5- Snapshot details displayed
>>>>> 6- Data region stats frozen
>>>>>
>>>>> I think the address resolution is working for this as well. I need to
>>>>> confirm, because I fixed the resolver as per your solution and visor no
>>>>> longer freezes on #6 above.
>>>>>
>>>>> On Mon, 29 Jun 2020 at 10:54, Ilya Kasnacheev <ilya.kasnach...@gmail.com> wrote:
>>>>>
>>>>>> Hello!
>>>>>>
>>>>>> This usually means there's no connectivity between the node and Visor.
>>>>>>
>>>>>> Regards,
>>>>>> --
>>>>>> Ilya Kasnacheev
>>>>>>
>>>>>>
>>>>>> Mon, 29 Jun 2020 at 17:01, John Smith <java.dev....@gmail.com>:
>>>>>>
>>>>>>> Also I think for Visor as well?
>>>>>>>
>>>>>>> When I do top or node commands, I can see the thick client. But when
>>>>>>> I look at detailed statistics for that particular thick client it
>>>>>>> freezes "indefinitely". Regular statistics seem ok.
>>>>>>>
>>>>>>> On Mon, 29 Jun 2020 at 08:08, Ilya Kasnacheev <ilya.kasnach...@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hello!
>>>>>>>>
>>>>>>>> For thick clients, you need both 47100 and 47500, both directions
>>>>>>>> (perhaps for 47500 only client -> server is sufficient, but for 47100,
>>>>>>>> both are needed).
>>>>>>>>
>>>>>>>> For thin clients, 10800 is enough. For control.sh, 11211.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>> --
>>>>>>>> Ilya Kasnacheev
>>>>>>>>
>>>>>>>>
>>>>>>>> Fri, 26 Jun 2020 at 22:06, John Smith <java.dev....@gmail.com>:
>>>>>>>>
>>>>>>>>> I'm asking in a separate question so people can search for it if they
>>>>>>>>> ever come across this...
>>>>>>>>>
>>>>>>>>> My server nodes are started as follows, and I also connect the client
>>>>>>>>> as such.
>>>>>>>>>
>>>>>>>>> <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
>>>>>>>>>   <property name="addresses">
>>>>>>>>>     <list>
>>>>>>>>>       <value>foo:47500</value>
>>>>>>>>>       ...
>>>>>>>>>     </list>
>>>>>>>>>   </property>
>>>>>>>>> </bean>
>>>>>>>>>
>>>>>>>>> In my client code I used the basic address resolver
>>>>>>>>>
>>>>>>>>> And I put in the map
>>>>>>>>>
>>>>>>>>> "{internalHostIP}:47500", "{externalHostIp}:{externalPort}"
>>>>>>>>>
>>>>>>>>> igniteConfig.setAddressResolver(addrResolver);
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> QUESTIONS
>>>>>>>>> ___________________
>>>>>>>>>
>>>>>>>>> 1- Port 47500 is used for discovery only?
>>>>>>>>> 2- Port 47100 is used for actual comms to the nodes?
>>>>>>>>> 3- In my container environment I have only mapped 47100, do I also
>>>>>>>>> need to map 47500 for the Tcp Discovery SPI?
>>>>>>>>> 4- When I connect with Visor and I try to look at details for the
>>>>>>>>> client node it blocks. I'm assuming that's because visor cannot
>>>>>>>>> connect back to the client at 47100?
>>>>>>>>> See logs below
>>>>>>>>>
>>>>>>>>> LOGS
>>>>>>>>> ___________________
>>>>>>>>>
>>>>>>>>> When I look at the client logs I get...
>>>>>>>>>
>>>>>>>>> IgniteConfiguration [
>>>>>>>>>   igniteInstanceName=xxxxxx,
>>>>>>>>>   ...
>>>>>>>>>   discoSpi=TcpDiscoverySpi [
>>>>>>>>>     addrRslvr=null, <--- Do I need to use BasicResolver or here???
>>>>>>>>>     ...
>>>>>>>>>   commSpi=TcpCommunicationSpi [
>>>>>>>>>     ...
>>>>>>>>>     locAddr=null,
>>>>>>>>>     locHost=null,
>>>>>>>>>     locPort=47100,
>>>>>>>>>     addrRslvr=null, <--- Do I need to use BasicResolver or here???
>>>>>>>>>     ...
>>>>>>>>>   ],
>>>>>>>>>   ...
>>>>>>>>>   addrRslvr=BasicAddressResolver [
>>>>>>>>>     inetAddrMap={},
>>>>>>>>>     inetSockAddrMap={/internalIp:47100=/externalIp:2389} <----
>>>>>>>>>   ],
>>>>>>>>>   ...
>>>>>>>>>   clientMode=true,
>>>>>>>>>   ...
>>>>>>>>>
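
The port requirements discussed in this thread (47500 for discovery, 47100 for communication, the latter reachable in both directions for thick clients) can be checked before starting Visor with a small TCP probe. The following is only a minimal sketch using the plain JDK: the class name PortProbe and the method isReachable are hypothetical helpers, not part of Ignite, and the 3-second timeout is an arbitrary assumption.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

/** Hypothetical helper (not an Ignite class): checks whether a TCP port accepts connections. */
final class PortProbe {
    /** Returns true if a TCP connection to host:port is established within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out (e.g. filtered), or unresolvable host: treat as unreachable.
            return false;
        }
    }

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("usage: java PortProbe.java <host> [port ...]");
            return;
        }
        // Default ports taken from the thread: 47500 (discovery), 47100 (communication).
        int[] ports = {47500, 47100};
        if (args.length > 1) {
            ports = new int[args.length - 1];
            for (int i = 1; i < args.length; i++) {
                ports[i - 1] = Integer.parseInt(args[i]);
            }
        }
        for (int port : ports) {
            // Assumed timeout of 3 seconds per port.
            boolean ok = isReachable(args[0], port, 3000);
            System.out.println(args[0] + ":" + port + " -> " + (ok ? "reachable" : "unreachable"));
        }
    }
}
```

To mirror the failing case, the probe would be run from the machine where Visor runs, against the address the client advertises. When an address resolver maps the client's internal address to an external one, the external host and port are the ones to test. A successful TCP connect only shows the port is not filtered; it does not prove that an Ignite node is listening there.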