[ https://issues.apache.org/jira/browse/HIVE-10617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14529664#comment-14529664 ]
Sergey Shelukhin commented on HIVE-10617:
-----------------------------------------

Will do this later. With 6 executors x 8 nodes I see an average of 0.5 allocation retries (not task retries) per 1000 tasks in a query reading entire lineitem from TPCH 1Tb scale. So it's annoying to have these retries, but not super important.

> LLAP: fix allocator concurrency rarely causing spurious failure to allocate due to "partitioned" locking
> --------------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-10617
>                 URL: https://issues.apache.org/jira/browse/HIVE-10617
>             Project: Hive
>          Issue Type: Sub-task
>            Reporter: Sergey Shelukhin
>            Assignee: Sergey Shelukhin
>
> See HIVE-10482 and the comment in code. Right now this is worked around by retrying.
> Simple case - thread can reserve memory from manager and bounce between checking arena 1 and arena 2 for memory as other threads allocate and deallocate from respective arenas in reverse order, making it look like there's no memory. More importantly this can happen when buddy blocks are split when lots of stuff is allocated.
> This can be solved either with some form of helping (esp. for split case) or by making allocator an "actor" (or set of actors, one per 1-N arenas that they would own), to satisfy alloc requests more deterministically (and also get rid of most sync).

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
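
For illustration only, the "actor" option from the issue description could look roughly like the Java sketch below. This is not Hive's allocator code: the class name, the method names, and the model of an arena as a plain free-byte counter (no buddy splitting) are all assumptions made for the example. The point it tries to show is that when a single thread owns all arenas, each request sees one consistent view of every arena, so it cannot bounce between arena 1 and arena 2 while other threads allocate and deallocate around it.

```java
import java.util.Arrays;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch of an actor-style allocator; not taken from Hive. */
final class ActorAllocatorSketch {
    // Free bytes per arena. Only the owner thread reads or writes this array,
    // so no per-arena locking is needed.
    private final long[] freeBytes;

    // The "actor": one thread that processes all requests in submission order.
    private final ExecutorService owner = Executors.newSingleThreadExecutor();

    ActorAllocatorSketch(int arenaCount, long arenaSize) {
        freeBytes = new long[arenaCount];
        Arrays.fill(freeBytes, arenaSize);
    }

    /** Completes with the index of the arena used, or -1 if no arena has room. */
    CompletableFuture<Integer> allocate(long size) {
        return CompletableFuture.supplyAsync(() -> {
            // The scan runs on the owner thread, so no other request can
            // change an arena between the check and the reservation.
            for (int i = 0; i < freeBytes.length; i++) {
                if (freeBytes[i] >= size) {
                    freeBytes[i] -= size;
                    return i;
                }
            }
            return -1;
        }, owner);
    }

    /** Returns previously allocated bytes to the given arena. */
    CompletableFuture<Void> free(int arena, long size) {
        return CompletableFuture.runAsync(() -> freeBytes[arena] += size, owner);
    }

    /** Stops the owner thread once queued requests have run. */
    void shutdown() {
        owner.shutdown();
    }
}
```

In this sketch a caller would do something like `int arena = allocator.allocate(80).join();` and treat -1 as a genuine out-of-memory result instead of retrying. The trade-off is that all requests are serialized through one thread; the description's "set of actors, one per 1-N arenas" variant would presumably shard this across several owner threads.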