[ 
https://issues.apache.org/jira/browse/IGNITE-11842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16840101#comment-16840101
 ] 

Aquilino Viveiros edited comment on IGNITE-11842 at 5/15/19 6:56 AM:
---------------------------------------------------------------------

There are definitely connectivity problems when using Ignite 2.7 (other 
versions might be affected) on Kubernetes. Here are a few details:

*Kubernetes version*
{code:java}
Client Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", 
GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", 
BuildDate:"2018-08-07T23:17:28Z", GoVersion:"go1.10.3", Compiler:"gc", 
Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.2", 
GitCommit:"bb9ffb1654d4a729bb4cec18ff088eacc153c239", GitTreeState:"clean", 
BuildDate:"2018-08-07T23:08:19Z", GoVersion:"go1.10.3", Compiler:"gc", 
Platform:"linux/amd64"}
{code}
 

*Java Microservices (Clients, containers on K8s)*
{code:java}
// containers base image (java jre with musl)
java:8-jre-alpine {code}
{code:java}
/ # java -version
openjdk version "1.8.0_111-internal"
OpenJDK Runtime Environment (build 1.8.0_111-internal-alpine-r0-b14)
OpenJDK 64-Bit Server VM (build 25.111-b14, mixed mode)
{code}
{code:java}
/ # ldd /usr/bin/java
/lib/ld-musl-x86_64.so.1 (0x7f77e34f9000)
Error loading shared library libjli.so: No such file or directory (needed by /usr/bin/java)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f77e34f9000)
Error relocating /usr/bin/java: JLI_Launch: symbol not found
{code}
 

*Official Docker Image Ignite 2.7 (Server, containers on K8s)*
{code:java}
/opt/ignite # java -version
openjdk version "1.8.0_181"
OpenJDK Runtime Environment (IcedTea 3.9.0) (Alpine 8.181.13-r0)
OpenJDK 64-Bit Server VM (build 25.181-b13, mixed mode)
{code}
{code:java}
/opt/ignite # ldd /usr/bin/java
/lib/ld-musl-x86_64.so.1 (0x7f80fe498000)
Error loading shared library libjli.so: No such file or directory (needed by /usr/bin/java)
libc.musl-x86_64.so.1 => /lib/ld-musl-x86_64.so.1 (0x7f80fe498000)
Error relocating /usr/bin/java: JLI_Launch: symbol not found
{code}
With the above setup, the first 2-4 clients would connect fine, but after that, 
the more clients we added, the more they struggled to connect to the server, up 
to the point where we reached around 10 connected clients. Visor would work 
fine with up to 2-4 clients; after that, Visor would not connect at all.

 

After a bit of reading, we suspected the problem might be Alpine with musl. We 
changed all our microservices to use another base image, Alpine with glibc.
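
One way to tell which C library a container's Java launcher is linked against is to look at the loader path in the `ldd` output, as in the listings in this comment. The following is only a minimal sketch: `detect_libc` is a hypothetical helper name, and matching on `ld-musl-` / `ld-linux-` is an assumption based on the loader paths shown here, not an exhaustive check.

```shell
#!/bin/sh
# Hypothetical helper: classify the C library from `ldd` output read on stdin.
detect_libc() {
    # Read all of stdin once so the whole text can be pattern-matched.
    out=$(cat)
    case "$out" in
        *ld-musl-*)  echo "musl" ;;    # e.g. /lib/ld-musl-x86_64.so.1
        *ld-linux-*) echo "glibc" ;;   # e.g. /lib64/ld-linux-x86-64.so.2
        *)           echo "unknown" ;;
    esac
}

# Usage inside a container (assumes `java` is on the PATH):
#   ldd "$(command -v java)" 2>&1 | detect_libc
```

Running this in each client and server container gives a quick view of which images still use musl.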

*Java Microservices (Clients, containers on K8s)*
{code:java}
// containers base image (java jre with glibc)
adoptopenjdk/openjdk8:alpine-slim{code}
{code:java}
/ # java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b03)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b03, mixed mode)
{code}
{code:java}
/ # ldd /opt/java/openjdk/bin/java
/lib64/ld-linux-x86-64.so.2 (0x7f4efc05d000)
libpthread.so.0 => /lib64/ld-linux-x86-64.so.2 (0x7f4efc05d000)
libjli.so => /opt/java/openjdk/bin/../lib/amd64/jli/libjli.so (0x7f4efbe46000)
libdl.so.2 => /lib64/ld-linux-x86-64.so.2 (0x7f4efc05d000)
libc.so.6 => /lib64/ld-linux-x86-64.so.2 (0x7f4efc05d000)
Error relocating /opt/java/openjdk/bin/../lib/amd64/jli/libjli.so: __rawmemchr: symbol not found
{code}
With the above, clients (~10 clients) would now connect to the server. But 
here, Visor would again struggle to connect.

We went a bit further and built a custom Ignite 2.7 Docker image based on 
adoptopenjdk/openjdk8:alpine-slim, using 
[https://github.com/apache/ignite/tree/2.7.0/docker/apache-ignite] as the setup.

 

*Custom Docker Image Ignite 2.7 (Server, containers on K8s)*
{code:java}
/opt/ignite # java -version
openjdk version "1.8.0_212"
OpenJDK Runtime Environment (AdoptOpenJDK)(build 1.8.0_212-b03)
OpenJDK 64-Bit Server VM (AdoptOpenJDK)(build 25.212-b03, mixed mode)
{code}
{code:java}
/opt/ignite # ldd /opt/java/openjdk/bin/java
/lib64/ld-linux-x86-64.so.2 (0x7fadc3193000)
libpthread.so.0 => /lib64/ld-linux-x86-64.so.2 (0x7fadc3193000)
libjli.so => /opt/java/openjdk/bin/../lib/amd64/jli/libjli.so (0x7fadc2f7c000)
libdl.so.2 => /lib64/ld-linux-x86-64.so.2 (0x7fadc3193000)
libc.so.6 => /lib64/ld-linux-x86-64.so.2 (0x7fadc3193000)
Error relocating /opt/java/openjdk/bin/../lib/amd64/jli/libjli.so: __rawmemchr: symbol not found
{code}
With this change, clients, server and Visor showed no network/connectivity 
problems (scaled to 10 clients).



> clients fails to connect
> ------------------------
>
>                 Key: IGNITE-11842
>                 URL: https://issues.apache.org/jira/browse/IGNITE-11842
>             Project: Ignite
>          Issue Type: Bug
>          Components: cache
>    Affects Versions: 2.7
>         Environment: kubernetes
>  
>            Reporter: James
>            Priority: Major
>
> The main symptom is that clients are failing to connect to the ignite 
> cluster, with reported timeouts in the logs.
> The main fact we have is this (from within the client within a kubernetes 
> container on Linux):
> / # netstat -ntp
> Active Internet connections (w/o servers)
> Proto Recv-Q Send-Q Local Address           Foreign Address          State       PID/Program name
> tcp   215796      0 ::ffff:10.42.2.97:43666 ::ffff:10.42.3.170:47500 ESTABLISHED 13/java
>  
> Namely, the application is failing to read data from the tcp socket. Notice 
> the “Recv-Q” of 215796.
>  
> This could be a client application, but the same thing also happens with 
> ignitevisor.sh
> Downgrading to Apache Ignite 2.3 resolves the problem.
> Tests so far:
> 2.7 intermittently fails to connect to the ignite cluster.
> 2.3 seems OK.
> 2.6 also fails after a number of clients have connected successfully.
>  
> Has anyone else seen this?
>  
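
The Recv-Q symptom described in the issue can be checked for across all connections with a small filter over `netstat -ntp` output. This is only a sketch under stated assumptions: `stalled_sockets` is a hypothetical helper name, the threshold is an arbitrary value chosen by the caller, and the column layout is assumed to match the listing quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print TCP connections whose Recv-Q exceeds a threshold.
# Reads `netstat -ntp` output on stdin; $1 is the threshold in bytes (default 0).
stalled_sockets() {
    awk -v limit="${1:-0}" '
        # Data rows start with the protocol name; Recv-Q is the second column,
        # local and foreign addresses are the fourth and fifth.
        $1 ~ /^tcp/ && $2 + 0 > limit + 0 { print $2, $4, $5 }
    '
}

# Usage inside a client container (the 100000 threshold is an assumption):
#   netstat -ntp 2>/dev/null | stalled_sockets 100000
```

A connection to port 47500 that stays on this list would match the reported behaviour of the application not reading from its socket.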



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
