Thanks a lot. Will look into it

On Fri, Jan 11, 2019 at 6:18 PM Wang Jiajun <[email protected]> wrote:

> Hi Kishore,
>
> I have sent a pull request to fix the first 2 issues.
> https://github.com/apache/helix/pull/297
> The 3rd one requires a much larger change. It also no longer breaks any
> logic now that we have fixed the ephemeral node owner validation, so we
> think it can be scheduled for a future release.
>
> Best Regards,
> Jiajun
>
>
> On Mon, Jan 7, 2019 at 3:57 PM Wang Jiajun <[email protected]> wrote:
>
>> Resending. Reply to all.
>>
>> We can probably fix the first 2 issues within 2 weeks, including the
>> additional testing and validation required.
>> For issue 1, we can split the original reset into 2 methods. When handling
>> a new session, we should not interrupt. When the client is closing, we
>> should interrupt the thread and shut down.
>> For issue 2, we additionally need a try/catch for the ZooKeeper NPE.
>>
>> Issue 3 will take more time since we need to change both ZkClient and
>> the event handler. Some interfaces may need to be updated. Moreover, it
>> changes the current ZkClient behavior, so we had better run it in the test
>> environment for a longer time.
>>
>> With the ephemeral node's owner fixed, the 3rd issue does not impact
>> correctness. So maybe we can fix the first 2 issues first and plan the
>> 3rd issue for the next release? In that case, we should have a release
>> candidate in 2 weeks.
>>
>> Best Regards,
>> Jiajun
>>
>>
>> On Mon, Jan 7, 2019 at 3:14 PM kishore g <[email protected]> wrote:
>>
>>> I think the pending issues are the ones that are affecting us. What does
>>> it take to fix those issues?
>>>
>>> On Mon, Jan 7, 2019 at 2:54 PM Wang Jiajun <[email protected]>
>>> wrote:
>>>
>>>> Hi Kishore,
>>>>
>>>> Hope you are doing well.
>>>> Since we last met to discuss potential ZkClient improvements in Helix,
>>>> we have completed the fix for one issue. However, resolving the whole
>>>> list will take more time. Given that Pinot is still waiting for the new
>>>> release, I'd like to hear your opinion on whether we should release
>>>> 0.8.3 in its current state.
>>>>
>>>> Fixed issues:
>>>>
>>>>    1. For an Ephemeral node, the source of truth should be the owner
>>>>    session Id instead of the node content.
>>>>    This fixes the leader election issue we found in the Pinot cluster.
>>>>
>>>> Pending issues:
>>>>
>>>>    1. ZkClient should not interrupt the callback handling during
>>>>    session reestablishment or other reset logic. Interrupt for shutdown
>>>>    should only happen when things are closed. To fix this problem, we
>>>>    need to think about how to handle thread leaking.
>>>>    2. ZkConnection.getZookeeper() == null can potentially cause
>>>>    retryUntilConnect to terminate earlier than expected. It should keep
>>>>    waiting on this error.
>>>>    3. The ZkClient event should keep a session Id. The event processor
>>>>    can then discard expired events.
>>>>
>>>> Best Regards,
>>>> Jiajun
>>>>
>>>
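
For pending issue 3 in the thread above (events keep a session Id so the processor can discard expired ones), a rough sketch might look like the following. It is only an illustration: `SessionEventFilter`, `SessionEvent` and `dropExpired` are hypothetical names, not existing Helix or ZkClient types, and the session Id is modeled as a plain long.

```java
import java.util.ArrayList;
import java.util.List;

final class SessionEventFilter {
    /** Hypothetical event wrapper; not an existing Helix or ZkClient type. */
    static final class SessionEvent {
        final long sessionId; // session that was active when the event was queued
        final String name;

        SessionEvent(long sessionId, String name) {
            this.sessionId = sessionId;
            this.name = name;
        }
    }

    /** Returns only the events that were queued under the current session. */
    static List<SessionEvent> dropExpired(List<SessionEvent> pending, long currentSessionId) {
        List<SessionEvent> live = new ArrayList<>();
        for (SessionEvent event : pending) {
            if (event.sessionId == currentSessionId) {
                live.add(event);
            }
        }
        return live;
    }

    public static void main(String[] args) {
        List<SessionEvent> pending = new ArrayList<>();
        pending.add(new SessionEvent(1L, "queued before session expiry"));
        pending.add(new SessionEvent(2L, "queued after reconnect"));
        // Assume session 2 is the current one; the first event is discarded.
        for (SessionEvent event : dropExpired(pending, 2L)) {
            System.out.println(event.name);
        }
    }
}
```

Comparing by session Id keeps the decision local to the event processor, so a handler never has to guess whether a queued event belongs to a session that has already expired.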

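
The plan for issue 1 in the thread above (one reset path that does not interrupt, one close path that does) could be sketched roughly as below. This is not the code from the pull request: `CallbackWorker`, `resetForNewSession` and `close` are hypothetical names, and clearing the queue on a new session is an assumption made only to give the reset path something to do.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

final class CallbackWorker {
    private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
    private final Thread thread;
    private volatile boolean closed = false;

    CallbackWorker() {
        thread = new Thread(this::runLoop, "callback-worker");
        thread.setDaemon(true);
        thread.start();
    }

    private void runLoop() {
        while (!closed) {
            try {
                queue.take().run();
            } catch (InterruptedException e) {
                return; // only close() interrupts this thread
            }
        }
    }

    void submit(Runnable callback) {
        queue.add(callback);
    }

    /** New-session path: drops queued callbacks (assumption), never interrupts. */
    void resetForNewSession() {
        queue.clear();
    }

    /** Close path: interrupts the worker and waits for it to exit, so no thread leaks. */
    void close() {
        closed = true;
        thread.interrupt();
        try {
            thread.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
    }

    boolean isAlive() {
        return thread.isAlive();
    }

    public static void main(String[] args) {
        CallbackWorker worker = new CallbackWorker();
        worker.submit(() -> System.out.println("callback ran"));
        worker.resetForNewSession();
        System.out.println("alive after reset: " + worker.isAlive());
        worker.close();
        System.out.println("alive after close: " + worker.isAlive());
    }
}
```

Keeping the interrupt inside `close()` means a callback that is running during session reestablishment finishes normally, while the join in `close()` addresses the thread leaking concern raised in the list of pending issues.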