[ 
https://issues.apache.org/jira/browse/UIMA-5605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16212784#comment-16212784
 ] 

Eddie Epstein commented on UIMA-5605:
-------------------------------------

This bug is exposed when there is active work on a node and the node's 
"usable" memory then shrinks. Usable memory is defined as machine memory size 
minus RAM used by "system" processes. System processes are all those owned by 
a UID below ducc.agent.rogue.process.sys.uid.max, which has a default value of 
500. The trace in the issue description suggests that usable memory had shrunk 
by 30GB on machine xxx.xxx.xx.
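
As a rough illustration of the definition above, usable memory could be 
computed along the following lines. This is only a sketch, not the DUCC agent 
code: the class, the Proc type, and the method name are hypothetical, and only 
the threshold default of 500 comes from the property mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

class UsableMemorySketch {
    // Hypothetical view of one process: owning UID and resident memory in KB.
    static final class Proc {
        final int uid;
        final long rssKb;
        Proc(int uid, long rssKb) { this.uid = uid; this.rssKb = rssKb; }
    }

    // Usable memory = machine memory minus RAM held by "system" processes,
    // i.e. processes whose UID is below sysUidMax (default 500 in DUCC).
    static long usableKb(long machineKb, List<Proc> procs, int sysUidMax) {
        long systemKb = 0;
        for (Proc p : procs) {
            if (p.uid < sysUidMax) {
                systemKb += p.rssKb;
            }
        }
        // Never report a negative amount of usable memory.
        return Math.max(0L, machineKb - systemKb);
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024L; // KB per GB
        List<Proc> procs = new ArrayList<>();
        procs.add(new Proc(0, 30 * gb));    // root-owned process: counts as "system"
        procs.add(new Proc(1000, 10 * gb)); // user process: does not reduce usable memory
        System.out.println(usableKb(64 * gb, procs, 500) / gb + " GB usable");
    }
}
```

In this sketch a root-owned process that grabs RAM lowers the usable memory of 
the node, while RAM held by user processes leaves it unchanged.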

The motivation for this design is to guarantee user processes the requested 
amount of RAM. DUCC currently does not stop running processes when usable 
memory shrinks, but it will not deploy new processes that do not fit in the 
remaining free RAM. The exception here occurred while looking for existing 
processes to preempt for a new, higher-priority process.

The problem is reproduced by running a preemptable job process on a node, 
having user root grab enough RAM to reduce the [quantized] usable memory of the 
node, and then trying to allocate a non-preemptable process on the node.
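
The failure mode can be illustrated with a small sketch. It assumes, without 
having confirmed it against the scheduler source, that the counter array is 
sized from the node's current (shrunken) share count while the free-share 
index is derived from allocations made when the node was larger. All names 
here are hypothetical, and the bounds guard is one possible alternative to 
commenting out the increment, not the committed fix.

```java
class FreeShareHistogramSketch {
    // histogram[i] counts machines that have exactly i free shares.
    // Returns false and skips the update when 'free' falls outside the array,
    // instead of throwing ArrayIndexOutOfBoundsException.
    static boolean recordFree(int[] histogram, int free) {
        if (free < 0 || free >= histogram.length) {
            return false;
        }
        histogram[free]++;
        return true;
    }

    public static void main(String[] args) {
        // Array sized for a node that currently offers at most 4 shares.
        int[] histogram = new int[4 + 1];

        // Within range: the counter is updated.
        System.out.println(recordFree(histogram, 3));

        // Preempting work that was placed when the node offered more shares
        // yields a free count larger than the array allows; an unguarded
        // histogram[free]++ would throw here.
        System.out.println(recordFree(histogram, 6));
    }
}
```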


> DUCC scheduler ArrayIndexOutOfBoundsException
> ---------------------------------------------
>
>                 Key: UIMA-5605
>                 URL: https://issues.apache.org/jira/browse/UIMA-5605
>             Project: UIMA
>          Issue Type: Bug
>          Components: DUCC
>    Affects Versions: 2.2.1-Ducc
>            Reporter: Jörg W
>             Fix For: future-DUCC
>
>
> The scheduler stops scheduling and ducc-mon indicates an unresponsive 
> ResourceManager.
> rm.log (Trace):
> 05 Okt 2017 11:29:13,336 TRACE RM.NodepoolScheduler- N/A detectFragmentation  
> Freed shares 246 on machine xxx.xxx.xx
> 05 Okt 2017 11:29:13,336 TRACE RM.NodepoolScheduler- N/A detectFragmentation  
> Update v before: NP[ --default-- ] v:   0   0   0   0   0   0   0   0   0   0 
>   0   0   0   0   0 0   0   0   0   0   0   0  
> 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 
>   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   
> 0   0   0   0   0   0   0   0   0   0   0   0   
> 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 
>   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   
> 0   0   0   0   0   0   0   0   0   0   0   0   0
> 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 
>   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   
> 0   0   0   0   0   0   0   0   0   0   0   0 
> 0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0   0 
>   0   0   0   0   0   0   0   0   0   0   0   0 
> 05 Okt 2017 11:29:13,336 FATAL RM.ResourceManagerComponent- N/A runScheduler 
> java.lang.ArrayIndexOutOfBoundsException
> An ArrayIndexOutOfBoundsException can occur in the NodepoolScheduler class at 
> line 2422:
> vmach_j[free]++;  
> Quick fix: comment it out. It seems to be used only for logging (trace).
>  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
