Actually, to Dom’s point, rather than a UUID you could probably use the
invoker’s IP or display name as the key: uniqueness is still guaranteed, the
same key gets reused on restart, and we still don’t have to worry about
tracking the counter.
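
Roughly what I have in mind, as a sketch only (assuming the jetcd client;
the key layout and API wiring here are just illustrative, not existing code):

    import java.nio.charset.StandardCharsets
    import io.etcd.jetcd.{ByteSequence, Client}
    import io.etcd.jetcd.op.{Cmp, CmpTarget, Op}
    import io.etcd.jetcd.options.PutOption

    object InvokerRegistration {
      private def bs(s: String) = ByteSequence.from(s, StandardCharsets.UTF_8)

      // Claim invokers/<ip> only if the key does not exist yet (version == 0).
      // The same invoker gets the same key back after a restart, and two
      // invokers with different IPs can never collide, so no counter is needed.
      def register(client: Client, ip: String, displayName: String): Boolean = {
        val key = bs(s"invokers/$ip")
        client.getKVClient
          .txn()
          .If(new Cmp(key, Cmp.Op.EQUAL, CmpTarget.version(0)))
          .Then(Op.put(key, bs(displayName), PutOption.DEFAULT))
          .commit()
          .get()
          .isSucceeded
      }
    }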

Brendan

On Tue, Dec 20, 2022 at 8:54 PM Brendan Doyle <bdoyle0...@gmail.com> wrote:

> I think it would be nice to make the invoker ids “memoryless” in that they
> just get a new id on restart, and that value can just be stored in etcd or
> any key-value data store. Two major things need to be handled in the new
> architecture to support that: 1. the healthcheck function is created using
> the invoker id as its name, so we need to make sure that function gets
> deleted, otherwise you’ll endlessly create new healthcheck functions; and
> 2. the container creation kafka topics, since they’re somewhat tied to the
> invoker and its assigned id. A new container creation topic would get
> created every time the invoker restarts if it gets a new uuid each time
> (which may be okay as long as we can clean up the old ones). Basically this
> all boils down to making sure we clean up all the side-effect data created
> by a new invoker uuid when that invoker is shut down.
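>
> For the topic half of that cleanup, roughly something like the following
> (just a sketch using Kafka’s AdminClient; the topic name is whatever the
> invoker’s container creation topic ends up being called):
>
>     import java.util.{Collections, Properties}
>     import org.apache.kafka.clients.admin.AdminClient
>
>     // On graceful shutdown, delete the per-invoker container creation
>     // topic so a restart with a fresh id does not leak topics.
>     def deleteInvokerTopic(bootstrapServers: String, topic: String): Unit = {
>       val props = new Properties()
>       props.put("bootstrap.servers", bootstrapServers)
>       val admin = AdminClient.create(props)
>       try admin.deleteTopics(Collections.singletonList(topic)).all().get()
>       finally admin.close()
>     }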
>
> The ack / results kafka topics wouldn’t have to be adjusted to achieve
> this because they’re coordinated by the controllers.
>
> On the topic of removing kafka from the critical path, I’m in total
> agreement with Dom; I think that would be a great architectural improvement
> goal for 2023. Since openwhisk already only offers at-most-once delivery at
> this point, I don’t think kafka is adding any value from a persistence
> perspective, and we now have zero-downtime graceful restarts to drain any
> in-progress work before stopping. I presume most of the original value was
> ordering, giving some level of fairness of processing between components.
>
>
>
> On Tue, Dec 20, 2022 at 7:48 PM Dominic Kim <style9...@gmail.com> wrote:
>
>> We can also consider ETCD.
>> ETCD supports a transaction API so we can make sure only one invoker can
>> be assigned an IP at a time.
>> (Though ETCD is currently only enabled when a scheduler is enabled.)
>>
>> Aside from this, I also wanted to get rid of Kafka from the critical path,
>> as we can now use the scheduler (the new messaging queue).
>> Since Kafka is used to deliver ContainerCreation messages and Ack + Result
>> messages, it would not be a simple task, but we are already using Akka-grpc
>> and Akka-remote (cluster), so we could migrate to them.
>>
>> -dom
>>
>>
>> On Wed, Dec 21, 2022 at 6:25 AM Michele Sciabarra <mich...@sciabarra.com>
>> wrote:
>>
>> > I have a question: what if we just use a UUID?
>> > That would be easy to implement,
>> > unless there are other reasons why the ids should be incremental numbers.
>> > I would like to try to get rid of this dependency because I really want
>> > to update Kafka, maybe using RedPanda or Kafka KRaft...
>> >
>> > --
>> >   Michele Sciabarra
>> >   mich...@sciabarra.com
>> >
>> > ----- Original message -----
>> > From: Brendan Doyle <bdoyle0...@gmail.com>
>> > To: dev@openwhisk.apache.org
>> > Subject: Re: Updating Kafka getting rid of Zookeeper
>> > Date: Sunday, December 18, 2022 4:36 PM
>> >
>> > Yes, this is one of the things on my list on slack of what I would like
>> > to see accomplished in 2023, since the only remaining zookeeper use is
>> > storing the invoker id mappings. I think it should be doable, but the one
>> > tricky part is the synchronization that the zookeeper counter provides.
>> > Whatever new mechanism is used, it must guarantee that two new invokers
>> > starting at the same time can’t hit a race condition where they both read
>> > the current counter value and assign themselves the same id.
>> > Alternatively, the way invoker ids are represented could be changed
>> > entirely to better accommodate scaling up and down, since part of the
>> > reason for the original incrementing integer invoker ids was the hashing
>> > algorithm used to schedule activations to invokers, which no longer
>> > exists. I also think an invoker node no longer has to have a persisted id
>> > that is reused on restart, because there’s no longer a per-invoker
>> > activation kafka topic, keyed by invoker id, that might still hold
>> > activations to be processed after a restart.
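>> >
>> > To make the race condition concrete: if we did keep a counter, the
>> > assignment would need to be an atomic compare-and-swap rather than a
>> > read-then-write. A rough sketch (assuming the jetcd client and an
>> > illustrative key, not existing code):
>> >
>> >     import java.nio.charset.StandardCharsets
>> >     import scala.annotation.tailrec
>> >     import io.etcd.jetcd.{ByteSequence, Client}
>> >     import io.etcd.jetcd.op.{Cmp, CmpTarget, Op}
>> >     import io.etcd.jetcd.options.PutOption
>> >
>> >     // Read the counter and its per-key version, then write counter + 1
>> >     // only if the version is unchanged; retry on conflict. Two invokers
>> >     // starting at the same time can never be handed the same id.
>> >     def nextInvokerId(client: Client): Int = {
>> >       val kv = client.getKVClient
>> >       val key = ByteSequence.from("invokers/idCounter", StandardCharsets.UTF_8)
>> >       @tailrec def attempt(): Int = {
>> >         val kvs = kv.get(key).get().getKvs
>> >         val (current, version) =
>> >           if (kvs.isEmpty) (0, 0L)
>> >           else (kvs.get(0).getValue.toString(StandardCharsets.UTF_8).toInt,
>> >                 kvs.get(0).getVersion)
>> >         val next = current + 1
>> >         val nextVal = ByteSequence.from(next.toString, StandardCharsets.UTF_8)
>> >         val updated = kv.txn()
>> >           .If(new Cmp(key, Cmp.Op.EQUAL, CmpTarget.version(version)))
>> >           .Then(Op.put(key, nextVal, PutOption.DEFAULT))
>> >           .commit()
>> >           .get()
>> >         if (updated.isSucceeded) next else attempt()
>> >       }
>> >       attempt()
>> >     }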
>> >
>> > But of all the work items on my list, this is probably the easiest value
>> > proposition: removing an entire dependency from the system.
>> >
>> > Brendan
>> >
>> > On Sun, Dec 18, 2022 at 8:36 AM Michele Sciabarra <
>> mich...@sciabarra.com>
>> > wrote:
>> >
>> > > Hello,
>> > >
>> > > I was trying to run OpenWhisk with a newer version of Kafka and also
>> > > with RedPanda, and I failed.
>> > > I found there is a dependency on version 4 of curator, which in turn
>> > > pins zookeeper to version 3.4, while newer versions of Kafka use 3.5
>> > > with a different protocol.
>> > >
>> > > In the code, curator is only used here:
>> > >
>> > > core/invoker/src/main/scala/org/apache/openwhisk/core/invoker/InstanceIdAssigner.scala
>> > >
>> > > What exactly is the purpose of this class?
>> > >
>> > > From what I can understand, it is storing the invoker’s id in zookeeper
>> > > for the purpose of guaranteeing its uniqueness.
>> > >
>> > > Can we replace it with something else that does not use zookeeper? We
>> > > now have etcd for the scheduler, if I am not wrong. Can we use etcd and
>> > > get rid of the dependency on zookeeper 3.4 that also holds back the
>> > > kafka version, so we can experiment with Kafka KRaft and RedPanda
>> > > (which does not use zookeeper at all)?
>> > >
>> > > --
>> > >   Michele Sciabarra
>> > >   mich...@sciabarra.com
>> > >
>> >
>>
>
