2020-10-21 18:31:01 UTC - Brendan Doyle: We're having issues with two 
high-scale functions mapping to the same home invoker, resulting in lots of 
container recreations because it's swapping between the two functions. I'm 
trying to see if we can tinker with any configs that might help. I see that 
`pauseGrace` defaults to 50 milliseconds. My theory is that if I increase this 
it essentially puts a lock on the container while waiting for another run of 
the function, so it should cause fewer swaps and more containers should just 
fall over to non-home invokers for these two functions. Does that understanding 
check out?
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603305061114600?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 18:35:13 UTC - Brendan Doyle: And follow up, have any operators out 
there changed this default and hit important negative side effects I should 
know about before playing with it?
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603305313114700?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 18:37:01 UTC - Dave Grove: Does anyone have recent experience using 
zipkin or similar tracing tool with OpenWhisk?   I found @James Thomas’s 
project (<https://github.com/jthomas/zipkin-instrumentation-openwhisk>), but 
since it was from 2017 I wasn’t sure if that was a good place to start, or if 
there was something newer.
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603305421116300?thread_ts=1603305421.116300&cid=C3TPCAQG1
----
2020-10-21 18:40:29 UTC - Dave Grove: I believe `pauseGrace` only actually 
does anything with the DockerContainerFactory. It’s basically how long the 
invoker should allow an idle container to run before doing a `docker pause` on 
it. The motivation is to prevent clever users from sneakily executing 
background computation between billable foreground invocations of functions in 
a container.
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603305629116400?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 18:41:27 UTC - Brendan Doyle: Yea I'm looking more closely at the 
code now. I was hoping that it wouldn't attempt to remove the container if it 
wasn't paused so it acts as a pseudo lock, but that doesn't seem to be the case
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603305687116600?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 18:41:39 UTC - Dave Grove: With the KubernetesContainerFactory, we 
don’t have the same ability to do `docker pause` and `docker unpause` on 
containers, so although you can set this to different values it won’t do 
anything.
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603305699116800?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 18:46:44 UTC - parichehr vahidinia: how to set different values for 
_idle-container_ and _pause-grace_ with the _kubernetesContainerFactory_ and 
also _DockerContainerFactory?_
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603306004118000?thread_ts=1603306004.118000&cid=C3TPCAQG1
----
2020-10-21 19:39:14 UTC - Dave Grove: The general trick for overriding the 
default values that are set in the various .conf files is to define environment 
variables that start with CONFIG_ in the invoker/controller pods.  For example, 
there is a property `whisk.loadbalancer.blackboxFraction`. You set the 
environment variable `CONFIG_whisk_loadbalancer_blackboxFraction` to set a 
different value.  If you look into invoker-pod.yaml and controller-pod.yaml in 
the OpenWhisk helm chart you will see quite a few examples of this being done.
+1 : parichehr vahidinia
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603309154118300?thread_ts=1603306004.118000&cid=C3TPCAQG1
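
The naming pattern in the example above can be sketched in Python. This is only an illustration: it assumes the rule is "prefix with `CONFIG_` and replace dots with underscores", which is all the single example in this thread shows. The helper name `config_env_name` is hypothetical, and property names containing hyphens or other characters may need different handling.

```python
def config_env_name(property_path: str) -> str:
    """Map a dotted config property to a CONFIG_-prefixed env var name.

    Assumption: dots become underscores and nothing else changes,
    as in the one example given in this thread.
    """
    return "CONFIG_" + property_path.replace(".", "_")


if __name__ == "__main__":
    # Property name taken from the message above.
    print(config_env_name("whisk.loadbalancer.blackboxFraction"))
```

The resulting name is what would go in the `env` section of the invoker or controller pod, next to the existing `CONFIG_` entries in the helm chart.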
----
2020-10-21 20:22:55 UTC - Rodric Rabbah: @Dave Grove since you answered the 
question here do you want to post it to stackoverflow?
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603311775118900?thread_ts=1603261638.111000&cid=C3TPCAQG1
----
2020-10-21 20:25:13 UTC - parichehr vahidinia: @Dave Grove any ideas on this?
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603311913119100?thread_ts=1602925036.090800&cid=C3TPCAQG1
----
2020-10-21 20:27:14 UTC - Rodric Rabbah: for internal tracing?
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312034119400?thread_ts=1603305421.116300&cid=C3TPCAQG1
----
2020-10-21 20:31:43 UTC - Rodric Rabbah: the pause grace does allow a container 
to stay unpaused longer, and if there is another activation in the 
corresponding invoker’s q that can use that container, it makes the container 
more likely to be reused (vs starting another container, or having to unpause a 
container)
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312303119700?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:32:32 UTC - Rodric Rabbah: it’s not a lock in the way you thought 
about it
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312352119900?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:35:09 UTC - Brendan Doyle: interesting, what makes it more likely 
to be reused? I'm working through the code right now and it doesn't seem like 
container pool has knowledge of whether it's paused or not.

But yea our issue is actual container removals from one function for another 
when the invoker is full, and then just swapping back and forth between them. 
Any ideas on how we might be able to mitigate that before the new pull-based 
scheduler?
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312509120100?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:35:52 UTC - Rodric Rabbah: it delays the state transition from 
running to paused
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312552120400?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:36:19 UTC - Rodric Rabbah: if you find the actor/state machine 
that manages the container life cycle, there should be one transition that’s 
affected / delayed
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312579120600?thread_ts=1603305061.114600&cid=C3TPCAQG1
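
A toy model of the transition described above, written in Python rather than as the invoker's actual actor code. The state names (`running`, `idle`, `paused`), the `Container` class, and its methods are all hypothetical; the only things taken from the thread are the 50 ms default and the idea that the grace period delays the move to paused.

```python
from dataclasses import dataclass
from typing import Optional

PAUSE_GRACE_MS = 50  # default value mentioned at the top of this thread


@dataclass
class Container:
    state: str = "running"  # one of "running", "idle", "paused"
    idle_since_ms: Optional[int] = None

    def finish_activation(self, now_ms: int) -> None:
        # The activation is done; the container stays unpaused for now.
        self.state = "idle"
        self.idle_since_ms = now_ms

    def start_activation(self) -> None:
        # Reuse: an idle or paused container goes back to running.
        self.state = "running"
        self.idle_since_ms = None

    def tick(self, now_ms: int, pause_grace_ms: int = PAUSE_GRACE_MS) -> None:
        # The one transition the grace period affects: idle -> paused.
        if self.state == "idle" and now_ms - self.idle_since_ms >= pause_grace_ms:
            self.state = "paused"


if __name__ == "__main__":
    c = Container()
    c.finish_activation(now_ms=0)
    c.tick(now_ms=49)
    print(c.state)
    c.tick(now_ms=50)
    print(c.state)
```

Nothing in this model stops a pool from removing an idle container to make room for another function, which is consistent with the point in the thread that a longer grace period is not a lock.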
----
2020-10-21 20:36:35 UTC - Rodric Rabbah: add another invoker 
:slightly_smiling_face:
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312595120800?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:36:47 UTC - Rodric Rabbah: are the functions from the same user?
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312607121000?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:37:28 UTC - Rodric Rabbah: you might be better off looking at the 
invoker hashing in the load balancer
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312648121200?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:37:43 UTC - Rodric Rabbah: this is a performance pathology with 
the current scheduler, unfortunately
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312663121400?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:38:26 UTC - Brendan Doyle: we have invoker space in the fleet, 
the problem is the two functions hash to the same invoker but yea it should 
redistribute the hashes with a new invoker but still vulnerable. Our current 
resolution is bringing down an invoker to rehash things ha
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312706121700?thread_ts=1603305061.114600&cid=C3TPCAQG1
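
A simplified Python illustration of why two functions can share a home invoker and why changing the invoker count reshuffles them. The load balancer's real hashing is not shown in this thread, so the SHA-256 hash, the modulo scheme, the helper names, and the action names below are all assumptions made for the sketch.

```python
import hashlib
from collections import defaultdict
from typing import Dict, List


def home_invoker(action: str, num_invokers: int) -> int:
    """Pick a home invoker index from a stable hash of the action name."""
    digest = hashlib.sha256(action.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_invokers


def collisions(actions: List[str], num_invokers: int) -> Dict[int, List[str]]:
    """Return invoker index -> actions, only for invokers that are home to 2+ actions."""
    by_invoker: Dict[int, List[str]] = defaultdict(list)
    for action in actions:
        by_invoker[home_invoker(action, num_invokers)].append(action)
    return {i: names for i, names in by_invoker.items() if len(names) > 1}


if __name__ == "__main__":
    actions = ["ns/actionA", "ns/actionB"]  # hypothetical action names
    for n in (4, 5):
        # Changing the invoker count changes the modulo, so homes can move.
        print(n, [home_invoker(a, n) for a in actions], collisions(actions, n))
```

With a scheme like this, adding or removing an invoker changes where each action lands, but it does not rule out a new collision, which is the "still vulnerable" point above.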
----
2020-10-21 20:38:52 UTC - Rodric Rabbah: :face_palm:
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312732121900?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:40:12 UTC - Brendan Doyle: yea I'm curious if we could do 
something fancy with the invoker hashing quickly in the load balancer that's 
not a big change until we have the new scheduler
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312812122100?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:40:15 UTC - Brendan Doyle: I'll look into that
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603312815122300?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:45:40 UTC - Brendan Doyle: Follow up question, when we run into 
this issue we get a ton of these logs:
```s"Rescheduling Run message, too many message in the pool, "```
I'm wondering if there's anything else I can deduce from them that could help 
us with configurations. Does blowing up the runBuffer cause things to 
significantly slow down?
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603313140122500?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:46:00 UTC - parichehr vahidinia: @Dave Grove Excuse me, I am a 
novice. Can you please point me to a document that explains how to set 
environment variables?
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603313160122700?thread_ts=1603306004.118000&cid=C3TPCAQG1
----
2020-10-21 20:46:20 UTC - Rodric Rabbah: i don’t know - this message used to 
indicate a bug but there’s been changes in that area of the scheduler which i 
haven’t followed
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603313180122900?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:47:14 UTC - Brendan Doyle: gotcha thanks! do you happen to have 
any idea what the bug used to be?
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603313234123100?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:48:30 UTC - Rodric Rabbah: i could be misleading you because it’s 
been a while --- it used to mean the state of the resource table (container 
allocation) was not consistent with the pipeline that feeds the invoker: pulled 
one too many messages and until a container is free to reconcile the state, 
you’ll get that message printed
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603313310123300?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 20:49:10 UTC - Brendan Doyle: yea that sounds about right
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603313350123500?thread_ts=1603305061.114600&cid=C3TPCAQG1
----
2020-10-21 21:55:37 UTC - Dave Grove: From the context, I’m guessing the ask is 
really for user-level tracing of actions, but references to either internal or 
external usage could be helpful
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603317337123700?thread_ts=1603305421.116300&cid=C3TPCAQG1
----
2020-10-21 22:12:26 UTC - Dave Grove: There is unfortunately not a lot of 
written documentation.   You can imitate the way it is done for other 
environment variables by editing the .yaml files for the controller or invoker 
pods.   For example, 
<https://github.com/apache/openwhisk-deploy-kube/blob/master/helm/openwhisk/templates/invoker-pod.yaml#L192-#L193>
 and 
<https://github.com/apache/openwhisk-deploy-kube/blob/master/helm/openwhisk/values.yaml#L272>
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603318346123900?thread_ts=1603306004.118000&cid=C3TPCAQG1
----
2020-10-21 22:14:03 UTC - parichehr vahidinia: thank you :hibiscus:
https://openwhisk-team.slack.com/archives/C3TPCAQG1/p1603318443124200?thread_ts=1603306004.118000&cid=C3TPCAQG1
----
