Hi Matthew,

Our Helm support is a recent addition, and came from another external contributor. See the Pull Request at https://github.com/Metaswitch/clearwater-docker/pull/85 for the details :)

As it stands at the moment, the chart is good enough for deploying and re-creating a full standard deployment through Helm, but I don't believe it handles as many of the complexities of upgrading a Clearwater deployment as it potentially could.
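(As an illustration of why the chart itself can stay so small: most of the content lives in the chart's templates, and values.yaml only needs to carry the knobs those templates reference. The fragment below is a hypothetical sketch, not the actual file from the PR; it just shows how the image.path and image.tag values quoted later in this thread would typically be consumed.)
```
# Hypothetical templates/bono-depl.yaml fragment (a sketch, not the real file
# from clearwater-docker PR #85), showing how a Helm template would consume
# the image.path and image.tag entries defined in values.yaml.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: bono
spec:
  replicas: 1
  template:
    metadata:
      labels:
        service: bono
    spec:
      containers:
      - name: bono
        # Rendered at install time, e.g. to "myregistry/bono:latest".
        image: "{{ .Values.image.path }}/bono:{{ .Values.image.tag }}"
```
Rendering can be checked without a cluster via `helm template` or `helm install --dry-run --debug`, which is usually the quickest way to see what such a minimal chart actually expands to.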
We haven't yet done any significant work in setting up Helm charts, or in integrating with them in a more detailed manner, so if that's something you're interested in as well, we'd love to work with you to get some more enhancements in, especially if you have other expert contacts who know more in this area.

(I'm removing some of the thread in the email below, to keep us below the list limits. The online archives will keep all the info though.)

Cheers,
Adam

From: Davis, Matthew [mailto:matthew.davi...@team.telstra.com]
Sent: 25 May 2018 08:30
To: Adam Lindley <adam.lind...@metaswitch.com>; clearwater@lists.projectclearwater.org
Subject: RE: [Project Clearwater] Issues with clearwater-docker homestead and homestead-prov under Kubernetes

Hi Adam,

Thanks for that. I haven't had a chance to try your latest suggestion or do more work on AKS yet.

I just have a quick question. Should the Helm charts be mostly empty? I spoke to someone at Microsoft and he was surprised at how short Chart.yaml is.

Chart.yaml:
```
apiVersion: v1
description: A Helm chart for Clearwater
name: clearwater
version: 0.1.0
```

values.yaml:
```
# Default values for clearwater.
# This is a YAML-formatted file.
# Declare variables to be passed into your templates.
image:
  path: {{IMAGE_PATH}}
  tag: {{IMAGE_TAG}}
```

Thanks,
Matthew Davis
Telstra Graduate Engineer
CTO | Cloud SDN NFV

From: Adam Lindley [mailto:adam.lind...@metaswitch.com]
Sent: Friday, 25 May 2018 3:05 AM
To: Davis, Matthew <matthew.davi...@team.telstra.com>; clearwater@lists.projectclearwater.org
Cc: Richard Whitehouse (projectclearwater.org) <richard.whiteho...@projectclearwater.org>; kie...@aptira.com
Subject: RE: [Project Clearwater] Issues with clearwater-docker homestead and homestead-prov under Kubernetes

Hey Matthew,

Sorry, I think I was a bit unclear, and I've also noticed I missed out one step in my previous message. I'll try to clarify a bit here.

> you're having issues connecting to Bono because the service/pod is not exposed

By this, I meant that your Bono service is not exposed outside the Kubernetes cluster. Other pods would have been able to reach it just fine, but your test setup was trying to reach into it from outside the host, and there was no configuration in place mapping it through to allow that to happen (e.g. a nodePort or a LoadBalancer-type configuration).

The step I missed out was that we also want to change the Bono service port on the inside of the pod, to match the one we set up on the host. This is because Bono will record-route itself using the PUBLIC_IP and the pcscf port, so if a SIP client tried to respond using this route, it would by default be sent back to port 5060. Because we can't open that up as a nodePort on the Kubernetes host, this would cause issues in the SIP flows.

As for why we are specifically interested in port 5060: this is the external P-CSCF port, i.e. the interface we want to send all our mainline SIP flows through. The others are used for WebRTC and restund, neither of which we are particularly interested in at the moment, and neither of which is critical for getting calls through.

So, I think we just want to make a couple of tweaks to the bono-svc.yaml file; I'm going to copy the bono yaml files in at the bottom. Put simply, we want the values of `port` and `nodePort` under the `- name: "5060"` section to match.
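Concretely (this is just the relevant fragment of the full bono-svc.yaml copied below), the 5060 stanza ends up with the two values equal:
```
  ports:
  - name: "5060"
    # The service port inside the cluster and the nodePort on the Kubernetes
    # host are kept identical, so the port that Bono record-routes (built from
    # PUBLIC_IP and the pcscf setting) is the same one reachable from outside.
    port: 32060
    nodePort: 32060
```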
With the configuration as in my yaml files, and changing the `--pcscf` option in the bono pod, I see my live tests passing under the following command:

`rake test[default.svc.cw-k8s.test] PROXY=<k8s-host-IP> PROXY_PORT=32060 ELLIS=<k8s-host-IP>:30080`

Hopefully you see the same. From the Bono logs you've got below, I think the issue is simply that the above misconfiguration meant traffic couldn't reach the bono service correctly. I think this should be the last step in this set of hurdles; hopefully you'll see tests and calls working :)

Unfortunately, I don't have much experience with weave networking, so I can't give much guidance on how to open the pods up more to the wider network to make this of more use. And if in your testing you are seeing no packets hit any of the pods, that's going to be the first thing we need to debug, as we've probably missed a different piece of network configuration somewhere.

On the other thread, have you made any further progress with the Azure Kubernetes Service setup? I would still be very interested to see that up and working too.

Hope this helps.

Cheers,
Adam

bono-svc.yaml:
```
apiVersion: v1
kind: Service
metadata:
  name: bono
spec:
  type: NodePort
  ports:
  - name: "3478"
    port: 3478
  - name: "5060"
    port: 32060
    nodePort: 32060
  - name: "5062"
    port: 5062
  selector:
    service: bono
```

bono-depl.yaml:
```
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: bono
spec:
  replicas: 1
  selector:
    matchLabels:
      service: bono
  template:
    metadata:
      labels:
        service: bono
        snmp: enabled
    spec:
      containers:
      - image:
        imagePullPolicy: Always
        name: bono
        ports:
        - containerPort: 22
        - containerPort: 3478
        - containerPort: 5060
        - containerPort: 5062
        - containerPort: 5060
          protocol: "UDP"
        - containerPort: 5062
          protocol: "UDP"
        envFrom:
        - configMapRef:
            name: env-vars
        env:
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: PUBLIC_IP
          value: 10.230.16.1
        livenessProbe:
          exec:
            command: ["/bin/bash", "/usr/share/kubernetes/liveness.sh", "3478 5062"]
          initialDelaySeconds: 30
        readinessProbe:
          exec:
            command: ["/bin/bash", "/usr/share/kubernetes/liveness.sh", "3478 5062"]
        volumeMounts:
        - name: bonologs
          mountPath: /var/log/bono
      - image: busybox
        name: tailer
        command: [ "tail", "-F", "/var/log/bono/bono_current.txt" ]
        volumeMounts:
        - name: bonologs
          mountPath: /var/log/bono
      volumes:
      - name: bonologs
        emptyDir: {}
      restartPolicy: Always
```

From: Davis, Matthew [mailto:matthew.davi...@team.telstra.com]
Sent: 24 May 2018 06:20
To: Adam Lindley <adam.lind...@metaswitch.com>; clearwater@lists.projectclearwater.org
Cc: Richard Whitehouse (projectclearwater.org) <richard.whiteho...@projectclearwater.org>; kie...@aptira.com
Subject: RE: [Project Clearwater] Issues with clearwater-docker homestead and homestead-prov under Kubernetes

Hi Adam,

Thanks for that.

> you're having issues connecting to Bono because the service/pod is not exposed

What do you mean by that? I've run `kubectl apply -f bono-svc.yaml`. The service is exposed, isn't it?

$ kubectl get services
NAME   TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)                      AGE
bono   ClusterIP   None         <none>        3478/TCP,5060/TCP,5062/TCP   1d

or after the pcscf changes:

NAME   TYPE       CLUSTER-IP       EXTERNAL-IP   PORT(S)                                        AGE
bono   NodePort   10.108.162.246   <none>        3478:30078/TCP,5060:30060/TCP,5062:30062/TCP   41m

I've cc'ed in my colleague (Kieran) who set up the cluster for me. He said it's using Weave Net.
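(For context on the two listings above: `ClusterIP None` is what a headless service looks like, reachable only from other pods inside the cluster, while the NodePort variant additionally maps each port onto the Kubernetes host. The sketch below is an assumption about the manifests involved rather than a copy from the repo; it is only meant to illustrate the difference being discussed.)
```
# Hypothetical before/after of the bono service spec (an assumption, not taken
# verbatim from clearwater-docker), matching the two listings above.
#
# Headless service: no cluster IP and no host port, so it is only reachable
# from inside the cluster. "kubectl get services" shows "ClusterIP None".
spec:
  clusterIP: None
  ports:
  - name: "5060"
    port: 5060
---
# NodePort service: each port is also opened on the Kubernetes host
# (5060 -> 30060), shown as "5060:30060/TCP" in the listing.
spec:
  type: NodePort
  ports:
  - name: "5060"
    port: 5060
    nodePort: 30060
```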
Contrary to what I thought, nothing will be externally routable. So that's an issue.

After adding the nodePorts I can view Ellis in a browser (http://10.3.1.76:30080/login.html), and `nc 10.3.1.76 30080 -v` shows that I can connect to the Kubernetes cluster on port 30080. `nc 10.3.1.76 30060 -v` says connection refused; when that happens I see a packet in tcpdump on the bono pod. I see nothing in tcpdump on bono on the relevant ports when I run the rake tests.

I'm curious about the ports for Bono. You keep mentioning port 5060. Is that the only port bono uses? What about 3478 and 5062?

I tried your suggestion for the pcscf thing. It didn't work. The stdio for the rake test is below. How can I figure out whether the 403 error is for ellis or bono? The last line of the rake test output says: "Error logs, including Call-IDs of failed calls, are in the 'logfiles' directory", but the logfiles directory is empty. Where can I find more verbose logs?

I ran tcpdump inside the bono and ellis pods during the test. If I filter by source IP address I see nothing, so I have to filter by port. The tcpdump command I'm using in bono is `tcpdump -a -vnni any port 5060 or port 5062 or port 3478 or port 30078 or port 30060 or port 30062`. Literally all ports are now open on the firewall.

Here is my bono-svc.yaml file:
```
apiVersion: v1
kind: Service
metadata:
  name: bono
spec:
  type: NodePort
  ports:
  - name: "3478"
    port: 3478
    nodePort: 30078
  - name: "5060"
    port: 5060
    nodePort: 30060
  - name: "5062"
    port: 5062
    nodePort: 30062
  selector:
    service: bono
```

Here is my bono-depl.yaml file:
```
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: bono
spec:
  replicas: 1
  selector:
    matchLabels:
      service: bono
  template:
    metadata:
      labels:
        service: bono
        snmp: enabled
    spec:
      containers:
      - image: "mlda065/bono:latest"
        imagePullPolicy: Always
        name: bono
        ports:
        - containerPort: 22
        - containerPort: 3478
        - containerPort: 5060
        - containerPort: 5062
        - containerPort: 5060
          protocol: "UDP"
        - containerPort: 5062
          protocol: "UDP"
        envFrom:
        - configMapRef:
            name: env-vars
        env:
        - name: MY_POD_IP
          valueFrom:
            fieldRef:
              fieldPath: status.podIP
        - name: PUBLIC_IP
          value: 10.3.1.76 #:6443
        volumeMounts:
        - name: bonologs
          mountPath: /var/log/bono
      - image: busybox
        name: tailer
        command: [ "tail", "-F", "/var/log/bono/bono_current.txt" ]
        volumeMounts:
        - name: bonologs
          mountPath: /var/log/bono
      volumes:
      - name: bonologs
        emptyDir: {}
      imagePullSecrets:
      - name: myregistrykey
      restartPolicy: Always
```

Here is my env-vars:
```
$ kubectl describe configmap env-vars
Name:         env-vars
Namespace:    default
Labels:       <none>
Annotations:  <none>

Data
====
ZONE:
----
default.svc.cluster.local
Events:  <none>
```

Here are my services once deployed:
```
$ kubectl get service
NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)                                        AGE
astaire          ClusterIP   None             <none>        11311/TCP                                      22m
bono             NodePort    10.108.162.246   <none>        3478:30078/TCP,5060:30060/TCP,5062:30062/TCP   22m
cassandra        ClusterIP   None             <none>        7001/TCP,7000/TCP,9042/TCP,9160/TCP            22m
chronos          ClusterIP   None             <none>        7253/TCP                                       22m
ellis            NodePort    10.97.76.168     <none>        80:30080/TCP                                   22m
etcd             ClusterIP   None             <none>        2379/TCP,2380/TCP,4001/TCP                     22m
homer            ClusterIP   None             <none>        7888/TCP                                       22m
homestead        ClusterIP   None             <none>        8888/TCP                                       22m
homestead-prov   ClusterIP   None             <none>        8889/TCP                                       22m
kubernetes       ClusterIP   10.96.0.1        <none>        443/TCP                                        2d
ralf             ClusterIP   None             <none>        10888/TCP                                      22m
sprout           ClusterIP   None             <none>        5052/TCP,5054/TCP                              22m
```
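(For reference, the env-vars ConfigMap shown via `kubectl describe` above would correspond to a manifest roughly like the following; this is a sketch reconstructed from that output, not a file quoted in the thread. The bono deployment pulls these keys into the container environment through its `envFrom`/`configMapRef` reference.)
```
# Hypothetical manifest equivalent to the "kubectl describe configmap env-vars"
# output above (only the ZONE key is taken from that output).
apiVersion: v1
kind: ConfigMap
metadata:
  name: env-vars
  namespace: default
data:
  ZONE: default.svc.cluster.local
```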
When I restarted the bono service there were a couple of warnings:
```
Defaulting container name to bono.
Use 'kubectl describe pod/bono-6dfc579b5-bdgzj -n default' to see all of the containers in this pod.
 * Restarting Bono SIP Edge Proxy bono
/etc/init.d/bono: line 63: ulimit: open files: cannot modify limit: Operation not permitted
/etc/init.d/bono: line 64: ulimit: open files: cannot modify limit: Invalid argument
23-05-2018 06:21:52.570 UTC [7f2b793ec7c0] Status utils.cpp:651: Switching to daemon mode
   ...done.
```

Here's the result of the test:
```
$ rake test[default.svc.cw-k8s.test] PROXY=10.3.1.76 PROXY_PORT=30060 SIGNUP_CODE='secret' ELLIS=10.3.1.76:30080 TESTS="Basic call - mainline"
Basic Call - Mainline (TCP) - Failed
  RestClient::Forbidden thrown:
   - 403 Forbidden
   - /home/ubuntu/.rvm/gems/ruby-1.9.3-p551/gems/rest-client-1.8.0/lib/restclient/abstract_response.rb:74:in `return!'
   - /home/ubuntu/.rvm/gems/ruby-1.9.3-p551/gems/rest-client-1.8.0/lib/restclient/request.rb:495:in `process_result'
   - /home/ubuntu/.rvm/gems/ruby-1.9.3-p551/gems/rest-client-1.8.0/lib/restclient/request.rb:421:in `block in transmit'
   - /usr/lib/ruby/2.3.0/net/http.rb:853:in `start'
   - /home/ubuntu/.rvm/gems/ruby-1.9.3-p551/gems/rest-client-1.8.0/lib/restclient/request.rb:413:in `transmit'
   - /home/ubuntu/.rvm/gems/ruby-1.9.3-p551/gems/rest-client-1.8.0/lib/restclient/request.rb:176:in `execute'
   - /home/ubuntu/.rvm/gems/ruby-1.9.3-p551/gems/rest-client-1.8.0/lib/restclient/request.rb:41:in `execute'
   - /home/ubuntu/.rvm/gems/ruby-1.9.3-p551/gems/rest-client-1.8.0/lib/restclient.rb:69:in `post'
```

Here is /var/log/bono/bono_current.txt in the bono pod:
```
24-05-2018 05:17:22.847 UTC [7f2b6b7fe700] Status alarm.cpp:244: Reraising all alarms with a known state
24-05-2018 05:17:22.847 UTC [7f2b6b7fe700] Status alarm.cpp:37: sprout issued 1012.3 alarm
24-05-2018 05:17:22.847 UTC [7f2b6b7fe700] Status alarm.cpp:37: sprout issued 1013.3 alarm
24-05-2018 05:17:33.337 UTC [7f2b4b7ee700] Status sip_connection_pool.cpp:428: Recycle TCP connection slot 11
24-05-2018 05:17:52.847 UTC [7f2b6b7fe700] Status alarm.cpp:244: Reraising all alarms with a known state
24-05-2018 05:17:52.847 UTC [7f2b6b7fe700] Status alarm.cpp:37: sprout issued 1012.3 alarm
24-05-2018 05:17:52.847 UTC [7f2b6b7fe700] Status alarm.cpp:37: sprout issued 1013.3 alarm
24-05-2018 05:18:21.346 UTC [7f2b4b7ee700] Status sip_connection_pool.cpp:428: Recycle TCP connection slot 36
24-05-2018 05:18:22.847 UTC [7f2b6b7fe700] Status alarm.cpp:244: Reraising all alarms with a known state
24-05-2018 05:18:22.847 UTC [7f2b6b7fe700] Status alarm.cpp:37: sprout issued 1012.3 alarm
24-05-2018 05:18:22.847 UTC [7f2b6b7fe700] Status alarm.cpp:37: sprout issued 1013.3 alarm
24-05-2018 05:18:52.847 UTC [7f2b6b7fe700] Status alarm.cpp:244: Reraising all alarms with a known state
24-05-2018 05:18:52.847 UTC [7f2b6b7fe700] Status alarm.cpp:37: sprout issued 1012.3 alarm
24-05-2018 05:18:52.847 UTC [7f2b6b7fe700] Status alarm.cpp:37: sprout issued 1013.3 alarm
24-05-2018 05:18:57.350 UTC [7f2b4b7ee700] Status sip_connection_pool.cpp:428: Recycle TCP connection slot 35
```

Thanks,
Matthew Davis
Telstra Graduate Engineer
CTO | Cloud SDN NFV
_______________________________________________
Clearwater mailing list
Clearwater@lists.projectclearwater.org
http://lists.projectclearwater.org/mailman/listinfo/clearwater_lists.projectclearwater.org