[ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14589:
------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.2.0
           Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> add consistent node replacement to LLAP for splits
> --------------------------------------------------
>
>                 Key: HIVE-14589
>                 URL: https://issues.apache.org/jira/browse/HIVE-14589
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>             Fix For: 2.2.0
>
>         Attachments: HIVE-14589.01.patch, HIVE-14589.02.patch, 
> HIVE-14589.03.patch, HIVE-14589.04.patch, HIVE-14589.patch
>
>
> See HIVE-14574. (copied from the comment below) This basically creates the 
> nodes in ZK for "slots" in the cluster. The LLAPs try to take the lowest 
> available slot, starting from 0. Unlike worker-... nodes, the slots are 
> reused, which is the intent. The LLAPs are always sorted by the slot number 
> for splits.
> The idea is that as long as an LLAP is running, it will retain the same 
> position in the ordering regardless of other LLAPs restarting, and without 
> the LLAPs knowing about each other, the predecessor's location (if restarted 
> in a different place), or the total size of the cluster.
> The restarting LLAPs may not take the same positions as their predecessors 
> (e.g. if two LLAPs restart they can swap slots) but it shouldn't matter 
> because they have lost their cache anyway.
> E.g. if you have LLAPs with slots 1-2-3-4 and I nuke and restart 1, 2, and 4, 
> they will take whatever slots are free, but 3 will stay the 3rd and retain 
> cache locality.
> This also handles size increase, as new LLAPs will always be added to the end 
> of the sequence, which is what consistent hashing needs.
> One case it doesn't handle is permanent cluster size reduction. There will be 
> a permanent gap if the removed LLAPs held slots in the middle; until some 
> are restarted, it will result in cache misses.
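
The slot scheme described above can be sketched roughly as follows. This is a minimal in-memory illustration in Python, not the code from the patch: SlotRegistry and its methods are hypothetical names, a plain dict stands in for the ZK slot nodes, and concurrent slot grabs (which ZK would have to arbitrate) are ignored. Slots are 0-based here, so the surviving LLAP "c" holds slot 2, i.e. the 3rd position.

```python
class SlotRegistry:
    """In-memory stand-in for the ZK slot nodes (hypothetical, not Hive code)."""

    def __init__(self):
        self._owner_by_slot = {}  # slot number -> daemon id

    def acquire(self, daemon_id):
        """Take the lowest free slot, starting from 0, and return it."""
        slot = 0
        while slot in self._owner_by_slot:
            slot += 1
        self._owner_by_slot[slot] = daemon_id
        return slot

    def release(self, daemon_id):
        """Free whatever slot the daemon held, so a later daemon can reuse it."""
        for slot, owner in list(self._owner_by_slot.items()):
            if owner == daemon_id:
                del self._owner_by_slot[slot]

    def slot_of(self, daemon_id):
        """Return the daemon's slot number, or None if it holds no slot."""
        for slot, owner in self._owner_by_slot.items():
            if owner == daemon_id:
                return slot
        return None

    def occupied_slots(self):
        """Slot numbers currently held, in ascending order."""
        return sorted(self._owner_by_slot)

    def ordered_daemons(self):
        """Daemons sorted by slot number, the ordering used for splits."""
        return [self._owner_by_slot[s] for s in self.occupied_slots()]


registry = SlotRegistry()
for name in ["a", "b", "c", "d"]:
    registry.acquire(name)            # a=0, b=1, c=2, d=3

for name in ["a", "b", "d"]:          # nuke everything except c
    registry.release(name)
for name in ["d2", "a2", "b2"]:       # restarts arrive in arbitrary order
    registry.acquire(name)            # d2=0, a2=1, b2=3

print(registry.slot_of("c"))          # 2: c never moved
print(registry.acquire("e"))          # 4: a new daemon lands at the end
registry.release("a2")                # permanent removal from the middle
print(registry.occupied_slots())      # [0, 2, 3, 4]: gap at slot 1
```

The restarted daemons end up in different slots than their predecessors, which is harmless since their caches are empty, while "c" keeps its slot throughout. The last line shows the unhandled case: the gap at slot 1 persists until some daemon starts and reuses it. How split placement treats that gap is not modeled in this sketch.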



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)