[jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-09-02 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14589:

   Resolution: Fixed
Fix Version/s: 2.2.0
   Status: Resolved  (was: Patch Available)

Committed to master. Thanks for the review!

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 2.2.0
>
> Attachments: HIVE-14589.01.patch, HIVE-14589.02.patch, 
> HIVE-14589.03.patch, HIVE-14589.04.patch, HIVE-14589.patch
>
>
> See HIVE-14574. (Copied from the comment below.) This creates nodes in ZK 
> for "slots" in the cluster. Each LLAP tries to take the lowest available 
> slot, starting from 0. Unlike worker-... nodes, the slots are reused, which 
> is the intent. For splits, the LLAPs are always sorted by slot number.
> The idea is that as long as an LLAP is running, it retains the same position 
> in the ordering regardless of other LLAPs restarting, and without the LLAPs 
> knowing about each other, a predecessor's location (if restarted in a 
> different place), or the total size of the cluster.
> Restarting LLAPs may not take the same positions as their predecessors 
> (e.g. if two LLAPs restart, they can swap slots), but that shouldn't matter 
> because they have lost their cache anyway.
> E.g. if you have LLAPs with slots 1-2-3-4 and I nuke and restart 1, 2, and 4, 
> they will take whatever slots are free, but 3 will stay the 3rd and retain 
> cache locality.
> This also handles a size increase, as new LLAPs will always be added to the 
> end of the sequence, which is what consistent hashing needs.
> One case it doesn't handle is permanent cluster size reduction. If LLAPs 
> holding slots in the middle are removed, there will be a permanent gap; 
> until some LLAPs are restarted, it will result in misses.
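
A minimal sketch of the slot scheme described above, assuming an in-memory dictionary stands in for the ZK slot nodes. The names `SlotRegistry`, `register`, `unregister` and `ordered` are hypothetical and are not taken from the patch; a real implementation would presumably rely on ZK node creation to settle races for a slot, which this sketch does not model.

```python
from typing import Dict, List


class SlotRegistry:
    """In-memory stand-in for the ZK slot nodes (an assumption of this sketch)."""

    def __init__(self) -> None:
        self._slots: Dict[int, str] = {}  # slot number -> daemon name

    def register(self, daemon: str) -> int:
        """Take the lowest free slot, starting from 0, and return it."""
        slot = 0
        while slot in self._slots:
            slot += 1
        self._slots[slot] = daemon
        return slot

    def unregister(self, daemon: str) -> None:
        """Free the slot held by a daemon so that it can be reused."""
        for slot, name in list(self._slots.items()):
            if name == daemon:
                del self._slots[slot]

    def ordered(self) -> List[str]:
        """Daemons sorted by slot number, i.e. the ordering used for splits."""
        return [self._slots[s] for s in sorted(self._slots)]


registry = SlotRegistry()
for name in ["llap-a", "llap-b", "llap-c", "llap-d"]:
    registry.register(name)

# Nuke and restart every daemon except llap-c.
for name in ["llap-a", "llap-b", "llap-d"]:
    registry.unregister(name)
for name in ["llap-d2", "llap-a2", "llap-b2"]:
    registry.register(name)

# The restarted daemons took whatever slots were free; llap-c did not move.
print(registry.ordered())
```

In this run the restarted daemons end up in a different order than their predecessors, while `llap-c` keeps the third position, and a daemon registered afterwards would land at the end of the sequence.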



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-09-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14589:

Attachment: HIVE-14589.04.patch

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.02.patch, 
> HIVE-14589.03.patch, HIVE-14589.04.patch, HIVE-14589.patch
>
>





[jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-31 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14589:

Attachment: HIVE-14589.03.patch

Added tests (3 taken pretty much from Curator, 3 new) and addressed the RB feedback.

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.02.patch, 
> HIVE-14589.03.patch, HIVE-14589.patch
>
>





[jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-30 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14589:

Attachment: HIVE-14589.02.patch

Cannot repro the minillap failure; attaching the same patch to re-check.

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.02.patch, 
> HIVE-14589.patch
>
>





[jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-26 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14589:

Description: 
See HIVE-14574. (Copied from the comment below.) This creates nodes in ZK for 
"slots" in the cluster. Each LLAP tries to take the lowest available slot, 
starting from 0. Unlike worker-... nodes, the slots are reused, which is the 
intent. For splits, the LLAPs are always sorted by slot number.
The idea is that as long as an LLAP is running, it retains the same position 
in the ordering regardless of other LLAPs restarting, and without the LLAPs 
knowing about each other, a predecessor's location (if restarted in a 
different place), or the total size of the cluster.
Restarting LLAPs may not take the same positions as their predecessors 
(e.g. if two LLAPs restart, they can swap slots), but that shouldn't matter 
because they have lost their cache anyway.
E.g. if you have LLAPs with slots 1-2-3-4 and I nuke and restart 1, 2, and 4, 
they will take whatever slots are free, but 3 will stay the 3rd and retain 
cache locality.

This also handles a size increase, as new LLAPs will always be added to the 
end of the sequence, which is what consistent hashing needs.

One case it doesn't handle is permanent cluster size reduction. If LLAPs 
holding slots in the middle are removed, there will be a permanent gap; until 
some LLAPs are restarted, it will result in misses.

  was:See HIVE-14574


> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.patch
>
>





[jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-24 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14589:

Attachment: HIVE-14589.01.patch

Rebased the patch. [~sseth] [~prasanth_j] ping? ;)

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.01.patch, HIVE-14589.patch
>
>
> See HIVE-14574





[jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14589:

Status: Patch Available  (was: Open)

[~prasanth_j] ready for review ;)

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.patch
>
>
> See HIVE-14574





[jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14589:

Attachment: HIVE-14589.patch

This patch includes HIVE-14574 (for HiveQA, since that patch is small anyway).
Some version of this worked: restored LLAPs come up in the same order, although 
there's a gap for a time. I almost wonder if it makes sense to insert bogus 
instances for empty slots when getting them. There were some changes after 
that, so I'll test again.
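
To illustrate the "bogus instances for empty slots" idea mentioned above, here is a hedged sketch. It is a thought experiment only: the comment merely wonders about the approach, and nothing here is taken from the patch. The function names and the use of `None` as the placeholder are assumptions made for the example.

```python
from typing import Dict, List, Optional


def compact_ordering(slots: Dict[int, str]) -> List[str]:
    """Daemons sorted by slot number, with empty slots simply skipped."""
    return [slots[s] for s in sorted(slots)]


def padded_ordering(slots: Dict[int, str]) -> List[Optional[str]]:
    """Same ordering, but with a None placeholder for every empty slot."""
    if not slots:
        return []
    return [slots.get(s) for s in range(max(slots) + 1)]


# Slot 2 is empty, e.g. because its daemon has not come back yet.
slots = {0: "llap-a", 1: "llap-b", 3: "llap-d"}

# Without placeholders, llap-d shifts from position 3 to position 2.
print(compact_ordering(slots))
# With placeholders, llap-d keeps position 3 while the gap exists.
print(padded_ordering(slots))
```

Whatever consumes the padded list would still need a rule for splits that map to a placeholder; that choice is left open here.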

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.patch
>
>
> See HIVE-14574





[jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-22 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14589:

Attachment: (was: HIVE-14589.WIP.patch)

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
>
> See HIVE-14574





[jira] [Updated] (HIVE-14589) add consistent node replacement to LLAP for splits

2016-08-19 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-14589?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HIVE-14589:

Attachment: HIVE-14589.WIP.patch

WIP patch on top of the other JIRA.
Not sure if it works yet; I was having some trouble with the cluster earlier. 
Will try next week.

[~prasanth_j] [~sseth] fyi

> add consistent node replacement to LLAP for splits
> --
>
> Key: HIVE-14589
> URL: https://issues.apache.org/jira/browse/HIVE-14589
> Project: Hive
>  Issue Type: Bug
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Attachments: HIVE-14589.WIP.patch
>
>
> See HIVE-14574


