Re: [DISCUSS] KIP-663: API to Start and Shut Down Stream Threads and to Request Closing of Kafka Streams Clients

Matthias J. Sax Tue, 08 Sep 2020 13:59:33 -0700

I am not sure if we need a new method? There is already
`localThreadsMetadata()`. What do we gain by adding a new one?


Returning the thread's name (as `Optional<String>`) for both add() and
remove() is fine with me.


-Matthias

On 9/8/20 12:58 PM, Sophie Blee-Goldman wrote:
> Sorry Bruno, I think I missed the end of your message with the
> numberOfAliveStreamThreads()
> proposal. I agree, that would be better than the alternatives I listed.
> That said:
> 
>> They rather suggest that the method returns a list of handles to the
> stream threads.
> 
> I hadn't thought of that originally, but now that you mention it, this
> might be a good idea.
> I don't think we should return actual handles on the threads, but maybe a
> list of the thread
> names rather than a single number of currently alive threads.
> 
> Since we seem to think it would be difficult if not impossible to keep
> track of the number
> of running stream threads, we should apply the same reasoning to the names
> and not
> assume the user can/will keep track of every thread returned by
> addStreamThread() or
> removeStreamThread(). Users should generally take any required action
> immediately
> after adding/removing the thread -- eg deregistering the thread metrics --
> but it might
> still be useful to provide a convenience method listing all of the current
> threads
> 
> And of course you could still get the number of threads easily by invoking
> size() on the
> returned list (or ordered set?).
> 
> On Tue, Sep 8, 2020 at 12:16 PM Bruno Cadonna <br...@confluent.io> wrote:
> 
>> Thank you again for the feedback Sophie!
>>
>> As I tried to point out in my previous e-mail, removing a stream thread
>> from a Kafka Streams client that does not have alive stream threads is
>> nothing exceptional for the client per se. However, it can become
>> exceptional within the context of the user. For example, if users want
>> to remove a stream thread from a client without alive stream threads
>> because one if their metrics say so, then this is exceptional in the
>> context of that user metric not in the context of the Kafka Streams
>> client. In that case, users should throw an exception and handle it.
>>
>> Regarding returning null, I do not like to return null because from a
>> development point of view there is no distinction between returning null
>> because we have a bug in the code or returning null because there are no
>> alive stream threads. Additionally, Optional<String> makes it more
>> explicit that the result could also be empty.
>>
>> Thank you for the alternative method names! However, with the names you
>> propose it is not immediately clear that the method returns an amount of
>> stream threads. They rather suggest that the method returns a list of
>> handles to the stream threads. I chose to use "aliveStreamThreads" to be
>> consistent with the client-level metric "alive-stream-threads" which
>> reports the same number of stream threads that
>> numberOfAliveStreamThreads() should report. If others also think that
>> the proposed name in the KIP is too clumsy, I am open to rename it, though.
>>
>> Best,
>> Bruno
>>
>>
>> On 08.09.20 20:12, Sophie Blee-Goldman wrote:
>>>> it's never a good sign when the discussion moves into the vote thread
>>>
>>> Hah, sorry, the gmail consolidation of [VOTE] and [DISCUSS] threads
>> strikes
>>> again.
>>> Thanks for redirecting me Bruno
>>>
>>> I suppose it's unfair to expect the callers to keep perfect track of the
>>> current
>>>   number of stream threads, but it also seems like you shouldn't be
>> calling
>>> removeStreamThread() when there are no threads left. Either you're just
>>> haphazardly removing threads and could unintentionally slip into a state
>> of
>>> no
>>> running threads without realizing it, or more realistically, you're
>>> carefully
>>> removing threads based on some metric(s) that convey whether the system
>> is
>>> over or under-provisioned. If your metrics say you're over-provisioned
>> but
>>> there's
>>> not one thread running, well, that certainly sounds exceptional to me. Or
>>> you might
>>> be right in that the cluster is over-provisioned but have just been
>>> directing the
>>> removeStreamThread() and addStreamThread() calls to instances at random,
>> and
>>> end up with one massive instance and one with no threads at all. Again,
>>> this
>>> probably merits some human intervention (or system redesign)
>>>
>>> That said, I don't think there's any real harm to just returning null in
>>> this case, but I hope
>>> that users would pay attention to this since it seems likely to indicate
>>> something has gone
>>> seriously wrong. I suppose Optional<String> would be a reasonable
>>> compromise.
>>>
>>> As for the method name, what about activeStreamThreads() or
>>> liveStreamThreads() ?
>>>
>>> On Mon, Sep 7, 2020 at 1:45 AM Bruno Cadonna <br...@confluent.io> wrote:
>>>
>>>> Hi John,
>>>>
>>>> I agree with you except for checking null. I would rather prefer to use
>>>> Optional<String> as the return type to both methods.
>>>>
>>>> I changed the subject from [VOTE] to [DISCUSS] so that we can follow up
>>>> in the discussion thread.
>>>>
>>>> Best,
>>>> Bruno
>>>>
>>>> On 04.09.20 23:12, John Roesler wrote:
>>>>> Hi Sophie,
>>>>>
>>>>> Uh, oh, it's never a good sign when the discussion moves
>>>>> into the vote thread :)
>>>>>
>>>>> I agree with you, it seems like a good touch for
>>>>> removeStreamThread() to return the name of the thread that
>>>>> got removed, rather than a boolean flag. Maybe the return
>>>>> value would be `null` if there is no thread to remove.
>>>>>
>>>>> If we go that way, I'd suggest that addStreamThread() also
>>>>> return the name of the newly created thread, or null if no
>>>>> thread can be created right now.
>>>>>
>>>>> I'm not completely sure if I think that callers of this
>>>>> method would know exactly how many threads there are. Sure,
>>>>> if a human being is sitting there looking at the metrics or
>>>>> logs and decides to call the method, it would work out, but
>>>>> I'd expect this kind of method to find its way into
>>>>> automated tooling that reacts to things like current system
>>>>> load or resource saturation. Those kinds of toolchains often
>>>>> are part of a distributed system, and it's probably not that
>>>>> easy to guarantee that the thread count they observe is
>>>>> fully consistent with the number of threads that are
>>>>> actually running. Therefore, an in-situ `int
>>>>> numStreamThreads()` method might not be a bad idea. Then
>>>>> again, it seems sort of optional. A caller can catch an
>>>>> exception or react to a `null` return value just the same
>>>>> either way. Having both add/remove methods behave similarly
>>>>> is probably more valuable.
>>>>>
>>>>> Thanks,
>>>>> -John
>>>>>
>>>>>
>>>>> On Thu, 2020-09-03 at 12:15 -0700, Sophie Blee-Goldman
>>>>> wrote:
>>>>>> Hey, sorry for the late reply, I just have one minor suggestion. Since
>>>> we
>>>>>> don't
>>>>>> make any guarantees about which thread gets removed or allow the user
>> to
>>>>>> specify, I think we should return either the index or full name of the
>>>>>> thread
>>>>>> that does get removed by removeThread().
>>>>>>
>>>>>> I know you just updated the KIP to return true/false if there
>>>> are/aren't any
>>>>>> threads to be removed, but I think this would be more appropriate as
>> an
>>>>>> exception than as a return type. I think it's reasonable to expect
>>>> users to
>>>>>> have some sense to how many threads are remaining, and not try to
>> remove
>>>>>> a thread when there is none left. To me, that indicates something
>> wrong
>>>>>> with the user application code and should be treated as an exceptional
>>>> case.
>>>>>> I don't think the same code clarify argument applies here as to the
>>>>>> addStreamThread() case, as there's no reason for an application to be
>>>>>> looping and retrying removeStreamThread()  since if that fails, it's
>>>> because
>>>>>> there are no threads left and thus it will continue to always fail.
>> And
>>>> if
>>>>>> the
>>>>>> user actually wants to shut down all threads, they should just close
>> the
>>>>>> whole application rather than call removeStreamThread() in a loop.
>>>>>>
>>>>>> While I generally think it should be straightforward for users to
>> track
>>>> how
>>>>>> many stream threads they have running, maybe it would be nice to add
>>>>>> a small utility method that does this for them. Something like
>>>>>>
>>>>>> // Returns the number of currently alive threads
>>>>>> boolean runningStreamThreads();
>>>>>>
>>>>>> On Thu, Sep 3, 2020 at 7:41 AM Matthias J. Sax <mj...@apache.org>
>>>> wrote:
>>>>>>
>>>>>>> +1 (binding)
>>>>>>>
>>>>>>> On 9/3/20 6:16 AM, Bruno Cadonna wrote:
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I would like to start the voting on KIP-663 that proposes to add
>>>> methods
>>>>>>>> to the Kafka Streams client to add and remove stream threads during
>>>>>>>> execution.
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-663%3A+API+to+Start+and+Shut+Down+Stream+Threads
>>>>>>>>
>>>>>>>> Best,
>>>>>>>> Bruno
>>>>>
>>>>
>>>
>>
>

signature.asc
Description: OpenPGP digital signature

Re: [DISCUSS] KIP-663: API to Start and Shut Down Stream Threads and to Request Closing of Kafka Streams Clients

Reply via email to