2022-07-06 20:52:20 UTC - piby: Hey all,
We are evaluating different serverless platforms for our k8s cluster. We have
spent a couple of hours today trying to install openwhisk on EKS 1.20 but
unfortunately weren't able to make it work.
There are limited logs and multiple containers are in “pod initializing” state
with no way to debug it.
Any help would be super useful to us. Thanks!
values.yaml
```
whisk:
  ingress:
    # NOTE: Replace <domain> with your cluster's actual domain
    apiHostName: test.xxx.xxx.com
    apiHostPort: 443
    apiHostProto: https
    type: Standard
    useInternally: false
    # NOTE: Replace <domain> with your cluster's actual domain
    domain: test.xxx.xxx.com
invoker:
  options: "-Dwhisk.kubernetes.user-pod-node-affinity.enabled=false"
  containerFactory:
    impl: kubernetes
affinity:
  enabled: false
toleration:
  enabled: false
k8s:
  domain: cluster.local
  dns: kube-dns.kube-system
  persistence:
    enabled: true
    hasDefaultStorageClass: false
    explicitStorageClass: efs-csi-openwhisk
metrics:
  # set true to enable prometheus exporter
  prometheusEnabled: true
  # passing prometheus-enabled by a config file, required by openwhisk
  whiskconfigFile: "whiskconfig.conf"
  # set true to enable Kamon
  kamonEnabled: false
  # set true to enable Kamon tags
  kamonTags: false
  # set true to enable user metrics
  userMetricsEnabled: true
```
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1657140740810409
----
2022-07-06 21:08:03 UTC - Bilal: I have a self managed Openwhisk deployment
running in EKS (kube). Currently we are doing just over 100,000 activations per
day. Hitting about a 0.5% system error rate with response code 3: Failed to run
container. The majority of my actions are blackbox (I have blackbox percent set
to 100%), however they are small docker files that simply extend existing OW
python containers by installing a few more packages (eg `pip install redis`). At
one point I had a 0% system error rate.
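For a sense of scale, here is a rough sketch of what those figures mean in absolute terms. It assumes exactly 100,000 activations per day and a flat 0.5% error rate; both are approximations of the numbers above, and `failed_per_day` is just an illustrative helper name.

```python
# Rough estimate of daily system errors from the approximate figures above.
ACTIVATIONS_PER_DAY = 100_000  # "just over 100,000"; treated as exact here
SYSTEM_ERROR_RATE = 0.005      # "about 0.5%"; treated as a flat rate


def failed_per_day(activations: int, error_rate: float) -> int:
    """Return the expected number of failed activations per day, rounded."""
    return round(activations * error_rate)


if __name__ == "__main__":
    print(failed_per_day(ACTIVATIONS_PER_DAY, SYSTEM_ERROR_RATE))
```

Under those assumptions that is on the order of 500 "Failed to run container" activations per day.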
I've done most of the recommendations here:
https://github.com/apache/openwhisk-deploy-kube/blob/master/docs/k8s-custom-build-cluster-scaleup.md
I assume at this point I'm Large scale. Linking values in :thread:
At this point I'm not sure if there's an obvious config that I missed or if
there are additional considerations at this scale? I have replicaCount set to 4
for controller/invoker but only 1 for elasticsearch activationStoreBackend. Not
sure if that should also be increased.
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1657141683129629?thread_ts=1657141683.129629&cid=C3TPCAQG1
----