There’s also Apache Ratis if you want a Raft library. It looks promising.

—
Matt Sicker

> On Dec 20, 2022, at 20:00, Brendan Doyle <bdoyle0...@gmail.com> wrote:
> 
> Actually, to Dom’s point, rather than a UUID you can probably use the IP or
> display name value as the key. That guarantees uniqueness, although the
> value will be reused on restart, and you still no longer have to worry
> about tracking the counter.
> 
> Brendan
> 
>> On Tue, Dec 20, 2022 at 8:54 PM Brendan Doyle <bdoyle0...@gmail.com> wrote:
>> 
>> I think it would be nice to make the invoker ids “memoryless” in that they
>> just get a new id on restart. This value can be stored in etcd or any
>> other key-value data store. Two major things need to be dealt with in the
>> new architecture to make that work:
>> 1. The healthcheck function is created using the invoker id as the name
>> of the function, so we need to make sure the function is deleted;
>> otherwise you’ll endlessly create healthcheck functions.
>> 2. The container creation Kafka topics are somewhat tied to the invoker
>> and its assigned id. A new container creation topic would get created
>> every time the invoker is restarted if it gets a new UUID every time
>> (which may be okay as long as we can clean up old ones).
>>
>> Basically this all boils down to making sure we account for cleaning up
>> all side-effect data created for a new invoker UUID when that invoker
>> gets shut down.
>> 
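To make the cleanup point above concrete, here is a minimal sketch, not OpenWhisk code. It assumes a plain dict standing in for etcd and a set standing in for the healthcheck functions and container creation topics; the class name, key layout, and resource names are all hypothetical.

```python
import uuid


class MemorylessInvoker:
    """One invoker run with a fresh id; every name here is hypothetical."""

    def __init__(self, store, resources):
        self.invoker_id = uuid.uuid4().hex  # new id on every (re)start
        self._store = store          # dict standing in for etcd or another KV store
        self._resources = resources  # set standing in for functions and topics

    def _side_effects(self):
        # Resources whose names embed the invoker id (assumed naming scheme).
        return [
            f"healthcheck-function/{self.invoker_id}",
            f"container-creation-topic/{self.invoker_id}",
        ]

    def __enter__(self):
        # Register the invoker and create its per-id resources.
        self._store[f"invokers/{self.invoker_id}"] = "up"
        self._resources.update(self._side_effects())
        return self

    def __exit__(self, exc_type, exc, tb):
        # Always clean up, even if the run failed, so old ids leave nothing behind.
        for name in self._side_effects():
            self._resources.discard(name)
        self._store.pop(f"invokers/{self.invoker_id}", None)
        return False
```

Used as `with MemorylessInvoker(store, resources) as inv: ...`, each restart gets a different id and leaves both the store and the resource set empty afterwards. A real invoker could also crash without running its shutdown path, so some external garbage collection of stale ids would likely still be needed.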
>> The ack / results kafka topics wouldn’t have to be adjusted to achieve
>> this because they’re coordinated by the controllers.
>> 
>> On the topic of removing Kafka from the critical path, I’m in total
>> agreement with Dom. I think that would be a great architectural
>> improvement goal for 2023. Since OpenWhisk is already at-most-once at this
>> point, I don’t think Kafka adds any value from a persistence perspective,
>> and we now have zero-downtime graceful restarts that drain any in-progress
>> work before stopping. I presume most of the original value was from an
>> ordering perspective, giving some level of fairness of processing between
>> components.
>> 
>> 
>> 
>>> On Tue, Dec 20, 2022 at 7:48 PM Dominic Kim <style9...@gmail.com> wrote:
>>> 
>>> We can also consider etcd.
>>> etcd supports a transaction API, so we can make sure only one invoker can
>>> be assigned an IP at a time.
>>> (Though etcd is currently only enabled when a scheduler is enabled.)
>>> 
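As a rough illustration of the transaction idea above, the sketch below uses an in-memory class with an atomic "write only if the key is absent" operation, which is the kind of guarantee a key-value transaction can provide. It is not an etcd client; `TinyKV`, `claim_invoker_id`, and the `invokers/id/<n>` key layout are assumptions made up for this example.

```python
import threading


class TinyKV:
    """In-memory stand-in for a transactional key-value store (not an etcd client)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def put_if_absent(self, key, value):
        """Atomically write key only if it does not exist; return True on success."""
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def get(self, key):
        with self._lock:
            return self._data.get(key)


def claim_invoker_id(store, invoker_name, max_ids=1024):
    """Claim the lowest free id slot; the key layout is hypothetical."""
    for candidate in range(max_ids):
        if store.put_if_absent(f"invokers/id/{candidate}", invoker_name):
            return candidate
    raise RuntimeError("no free invoker id")


def claim_concurrently(store, names):
    """Start one thread per invoker name and collect the ids they claim."""
    results = {}

    def worker(name):
        results[name] = claim_invoker_id(store, name)

    threads = [threading.Thread(target=worker, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because the existence check and the write happen as one atomic step, two invokers starting at the same time can never both win the same slot; the loser simply moves on to the next candidate.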
>>> Aside from this, I also want to get Kafka off the critical path, since
>>> we can now use the scheduler (the new message queue).
>>> Since Kafka is used to deliver ContainerCreation messages and Ack +
>>> Result messages, it would not be a simple task, but we are already using
>>> Akka-grpc and Akka-remote (cluster).
>>> We can migrate to them.
>>> 
>>> -dom
>>> 
>>> On Wed, Dec 21, 2022 at 6:25 AM Michele Sciabarra <mich...@sciabarra.com> wrote:
>>> 
>>>> I have a question: what if we just use a UUID?
>>>> That would be easy to implement,
>>>> unless there are other reasons why the ids should be incremental
>>>> numbers.
>>>> I would like to get rid of this dependency because I really want to
>>>> update Kafka, maybe using RedPanda or Kafka KRaft...
>>>> 
>>>> --
>>>>  Michele Sciabarra
>>>>  mich...@sciabarra.com
>>>> 
>>>> ----- Original message -----
>>>> From: Brendan Doyle <bdoyle0...@gmail.com>
>>>> To: dev@openwhisk.apache.org
>>>> Subject: Re: Updating Kafka getting rid of Zookeeper
>>>> Date: Sunday, December 18, 2022 4:36 PM
>>>> 
>>>> Yes, this is one of the things on my Slack list of what I would like to
>>>> see accomplished in 2023, since the only ZooKeeper use is storing the
>>>> invoker id mappings. I think it should be doable, but the one tricky
>>>> part is the synchronization that the ZooKeeper counter provides.
>>>> Whatever new mechanism is used, it must guarantee that two new invokers
>>>> starting at the same time cannot hit a race condition where they both
>>>> read the current counter value and assign themselves the same id.
>>>> Alternatively, the way invoker ids are represented could be changed
>>>> entirely to be more accommodating to auto scaling up and down, since
>>>> part of the reason for the original incrementing integer invoker ids
>>>> was the hashing algorithm used to schedule activations to invokers,
>>>> which no longer exists. I think an invoker node also no longer needs a
>>>> persisted id that is reused on restart, because there’s no longer an
>>>> invoker activation Kafka topic named with the invoker id where
>>>> activations might still be left to process after restart.
>>>> 
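The race described above (two invokers reading the same counter value) is usually avoided with a compare-and-swap retry loop. The sketch below shows that pattern against an in-memory counter; it is an illustration only, and `VersionedCounter` and `next_invoker_id` are hypothetical names, not part of OpenWhisk, ZooKeeper, or etcd.

```python
import threading


class VersionedCounter:
    """In-memory stand-in for a counter key with compare-and-swap semantics."""

    def __init__(self):
        self._value = 0
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_and_swap(self, expected, new):
        """Set the value to new only if it still equals expected."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def next_invoker_id(counter):
    """Retry read + compare-and-swap until this caller wins an id."""
    while True:
        current = counter.read()
        # If another invoker bumped the counter in between, the swap fails
        # and we re-read instead of reusing the stale value.
        if counter.compare_and_swap(current, current + 1):
            return current
```

A plain read followed by an unconditional write would let both invokers keep the same value; making the write conditional on the value that was read is what closes that window.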
>>>> But of all the work items I have on my list, this is probably the
>>>> easiest value proposition: removing an entire dependency from the
>>>> system.
>>>> 
>>>> Brendan
>>>> 
>>>> On Sun, Dec 18, 2022 at 8:36 AM Michele Sciabarra <mich...@sciabarra.com> wrote:
>>>> 
>>>>> Hello,
>>>>> 
>>>>> I was trying to run OpenWhisk with a newer version of Kafka and also
>>>>> with RedPanda, and I failed.
>>>>> I found there is a dependency on version 4 of Curator that in turn
>>>>> pins ZooKeeper to version 3.4, while newer versions of Kafka use 3.5
>>>>> with a different protocol.
>>>>> 
>>>>> In the code, Curator is only used here:
>>>>>
>>>>> core/invoker/src/main/scala/org/apache/openwhisk/core/invoker/InstanceIdAssigner.scala
>>>>> 
>>>>> What exactly is the purpose of this class?
>>>>>
>>>>> From what I can understand, it is storing in ZooKeeper the id of the
>>>>> invoker for the purpose of guaranteeing
>>>>>
>>>>> Can we replace it with something else that does not use ZooKeeper? We
>>>>> now have etcd for the scheduler, if I am not wrong. Can we use etcd and
>>>>> get rid of the dependency on ZooKeeper 3.4, which also holds back the
>>>>> version of Kafka, so we can experiment with Kafka KRaft and RedPanda
>>>>> (which does not have ZooKeeper at all)?
>>>>> 
>>>>> --
>>>>>  Michele Sciabarra
>>>>>  mich...@sciabarra.com
>>>>> 
>>>> 
>>> 
>> 
