[ https://issues.apache.org/jira/browse/MESOS-434?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yan Xu closed MESOS-434.
------------------------

       Resolution: Fixed
    Fix Version/s: 0.14.0
         Assignee: Yan Xu  (was: Benjamin Hindman)

Fixed in MESOS-534.
                
> Process isolator libprocess throws exception
> --------------------------------------------
>
>                 Key: MESOS-434
>                 URL: https://issues.apache.org/jira/browse/MESOS-434
>             Project: Mesos
>          Issue Type: Bug
>            Reporter: Vinod Kone
>            Assignee: Yan Xu
>             Fix For: 0.14.0
>
>
> This occurred during one of the slave recovery tests that calls slave 
> shutdown. The process isolator terminated with the following error:
> {code}
> libprocess: process-isolator(379)@10.37.184.103:37325 terminating due to 
> basic_filebuf::underflow error reading the file
> {code}
> Relevant test output, for posterity:
> {code}
> I0413 20:35:29.195838 19301 exec.cpp:321] Executor asked to shutdown
> I0413 20:35:29.195864 54312 status_update_manager.cpp:359] Received status 
> update acknowledgement for task 055a6671-8348-4b72-9fde-1c7ff667fa5c of 
> framework 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.195948 54312 status_update_manager.hpp:298] Checkpointing ACK 
> for status update TASK_RUNNING from task 055a6671-8348-4b72-9fde-1c7ff667fa5c 
> of framework 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.195983 19311 exec.cpp:75] Scheduling shutdown of the executor
> Waited on process 19319, returned status 15
> I0413 20:35:29.196202 19312 exec.cpp:382] Executor sending status update for 
> task 055a6671-8348-4b72-9fde-1c7ff667fa5c in state TASK_FAILED
> I0413 20:35:29.196606 54312 status_update_manager.hpp:329] Handling ACK for 
> status update TASK_RUNNING from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of 
> framework 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.196769 54313 slave.cpp:1093] Status update manager 
> successfully handled status update acknowledgement for task 
> 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 
> 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.198374 54291 slave.cpp:1433] Handling status update 
> TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 
> 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.198786 54312 status_update_manager.cpp:288] Received status 
> update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of 
> framework 201304132035-1740121354-37325-54287-0000 with checkpoint=true
> I0413 20:35:29.198886 54312 status_update_manager.hpp:298] Checkpointing 
> UPDATE for status update TASK_FAILED from task 
> 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 
> 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.199542 54312 status_update_manager.hpp:329] Handling UPDATE 
> for status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c 
> of framework 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.199611 54312 status_update_manager.cpp:334] Forwarding status 
> update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of 
> framework 201304132035-1740121354-37325-54287-0000 to the master at 
> [email protected]:37325
> I0413 20:35:29.199827 54298 master.cpp:1086] Status update from 
> (4588)@10.37.184.103:37325: task 055a6671-8348-4b72-9fde-1c7ff667fa5c of 
> framework 201304132035-1740121354-37325-54287-0000 is now in state TASK_FAILED
> I0413 20:35:29.199836 54291 slave.cpp:1494] Sending ACK for status update 
> TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 
> 201304132035-1740121354-37325-54287-0000 to executor 
> executor(1)@10.37.184.103:42359
> I0413 20:35:29.200048 54306 sched.cpp:327] Received status update TASK_FAILED 
> from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 
> 201304132035-1740121354-37325-54287-0000 from slave(740)@10.37.184.103:37325
> I0413 20:35:29.200098 54298 master.hpp:300] Removing task with resources 
> cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 
> 201304132035-1740121354-37325-54287-0
> I0413 20:35:29.200218 54306 sched.cpp:360] Sending ACK for status update 
> TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 
> 201304132035-1740121354-37325-54287-0000 to slave(740)@10.37.184.103:37325
> I0413 20:35:29.200290 19318 process.cpp:878] Socket closed while receiving
> I0413 20:35:29.200345 19302 exec.cpp:283] Ignoring ACK for status update of 
> task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 
> 201304132035-1740121354-37325-54287-0000 because the driver is aborted!
> I0413 20:35:29.200369 54293 slave.cpp:1056] Got acknowledgement of status 
> update for task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 
> 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.200515 54292 hierarchical_allocator_process.hpp:544] Recovered 
> cpus=2; mem=1024; ports=[31000-32000]; disk=1024 (total allocatable: cpus=2; 
> mem=1024; ports=[31000-32000]; disk=1024) on slave 
> 201304132035-1740121354-37325-54287-0 from framework 
> 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.200598 54310 status_update_manager.cpp:359] Received status 
> update acknowledgement for task 055a6671-8348-4b72-9fde-1c7ff667fa5c of 
> framework 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.200690 54310 status_update_manager.hpp:298] Checkpointing ACK 
> for status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c 
> of framework 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.201251 54310 status_update_manager.hpp:329] Handling ACK for 
> status update TASK_FAILED from task 055a6671-8348-4b72-9fde-1c7ff667fa5c of 
> framework 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.201344 54310 status_update_manager.cpp:481] Cleaning up status 
> update stream for task 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 
> 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.201529 54310 slave.cpp:1093] Status update manager 
> successfully handled status update acknowledgement for task 
> 055a6671-8348-4b72-9fde-1c7ff667fa5c of framework 
> 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.205672 54292 hierarchical_allocator_process.hpp:660] Found 
> available resources: cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on 
> slave 201304132035-1740121354-37325-54287-0
> I0413 20:35:29.205793 54292 hierarchical_allocator_process.hpp:686] Offering 
> cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 
> 201304132035-1740121354-37325-54287-0 to framework 
> 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.206107 54292 hierarchical_allocator_process.hpp:599] Performed 
> allocation for 1 slaves in 478.92us
> I0413 20:35:29.206265 54293 master.hpp:309] Adding offer with resources 
> cpus=2; mem=1024; ports=[31000-32000]; disk=1024 on slave 
> 201304132035-1740121354-37325-54287-0
> I0413 20:35:29.206428 54293 master.cpp:1327] Sending 1 offers to framework 
> 201304132035-1740121354-37325-54287-0000
> I0413 20:35:29.206624 54293 sched.cpp:282] Received 1 offers
> I0413 20:35:29.215816 54292 hierarchical_allocator_process.hpp:668] No 
> resources available to allocate!
> I0413 20:35:29.215885 54292 hierarchical_allocator_process.hpp:599] Performed 
> allocation for 1 slaves in 106.44us
> I0413 20:35:29.226014 54301 hierarchical_allocator_process.hpp:668] No 
> resources available to allocate!
> I0413 20:35:29.226085 54301 hierarchical_allocator_process.hpp:599] Performed 
> allocation for 1 slaves in 103.05us
> W0413 20:35:29.236067 54298 master.cpp:81] No whitelist given. Advertising 
> offers for all slaves
> I0413 20:35:29.236207 54293 hierarchical_allocator_process.hpp:668] No 
> resources available to allocate!
> I0413 20:35:29.236299 54293 hierarchical_allocator_process.hpp:599] Performed 
> allocation for 1 slaves in 146.62us
> I0413 20:35:29.236767 54312 monitor.cpp:206] Publishing resource usage for 
> executor '055a6671-8348-4b72-9fde-1c7ff667fa5c' of framework 
> '201304132035-1740121354-37325-54287-0000'
> I0413 20:35:29.246213 54309 hierarchical_allocator_process.hpp:668] No 
> resources available to allocate!
> I0413 20:35:29.246271 54309 hierarchical_allocator_process.hpp:599] Performed 
> allocation for 1 slaves in 87.00us
> I0413 20:35:29.256467 54297 hierarchical_allocator_process.hpp:668] No 
> resources available to allocate!
> I0413 20:35:29.256556 54297 hierarchical_allocator_process.hpp:599] Performed 
> allocation for 1 slaves in 129.68us
> ..................
> ..................
>  20:35:30.149065 54296 master.cpp:81] No whitelist given. Advertising offers 
> for all slaves
> I0413 20:35:30.149341 54310 hierarchical_allocator_process.hpp:668] No 
> resources available to allocate!
> I0413 20:35:30.149401 54310 hierarchical_allocator_process.hpp:599] Performed 
> allocation for 1 slaves in 102.93us
> I0413 20:35:30.149641 54296 monitor.cpp:206] Publishing resource usage for 
> executor '055a6671-8348-4b72-9fde-1c7ff667fa5c' of framework 
> '201304132035-1740121354-37325-54287-0000'
> I0413 20:35:30.159216 54311 hierarchical_allocator_process.hpp:668] No 
> resources available to allocate!
> I0413 20:35:30.159286 54311 hierarchical_allocator_process.hpp:599] Performed 
> allocation for 1 slaves in 120.69us
> I0413 20:35:30.169424 54304 hierarchical_allocator_process.hpp:668] No 
> resources available to allocate!
> I0413 20:35:30.169503 54304 hierarchical_allocator_process.hpp:599] Performed 
> allocation for 1 slaves in 119.82us
> I0413 20:35:30.179558 54301 hierarchical_allocator_process.hpp:668] No 
> resources available to allocate!
> I0413 20:35:30.179635 54301 hierarchical_allocator_process.hpp:599] Performed 
> allocation for 1 slaves in 106.45us
> I0413 20:35:30.189718 54297 hierarchical_allocator_process.hpp:668] No 
> resources available to allocate!
> I0413 20:35:30.189789 54297 hierarchical_allocator_process.hpp:599] Performed 
> allocation for 1 slaves in 106.73us
> I0413 20:35:30.197923 54316 process.cpp:878] Socket closed while receiving
> W0413 20:35:30.199760 54299 master.cpp:81] No whitelist given. Advertising 
> offers for all slaves
> I0413 20:35:30.199870 54311 hierarchical_allocator_process.hpp:668] No 
> resources available to allocate!
> I0413 20:35:30.199952 54311 hierarchical_allocator_process.hpp:599] Performed 
> allocation for 1 slaves in 121.93us
> libprocess: process-isolator(379)@10.37.184.103:37325 terminating due to 
> basic_filebuf::underflow error reading the file
> W0413 20:35:30.200603 54310 monitor.cpp:212] Failed to collect resource usage 
> for executor '055a6671-8348-4b72-9fde-1c7ff667fa5c' of framework 
> '201304132035-1740121354-37325-54287-0000': 0
> {code}
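
For readers unfamiliar with the error text: in libstdc++, "basic_filebuf::underflow error reading the file" is the message of the exception thrown when a low-level read on an already-open file fails. The ticket does not say which file the process isolator was reading, nor what the fix in MESOS-534 changed, so the following is only a minimal C++ sketch of one general way to keep such a failure from terminating the caller. `readFile` is a hypothetical helper introduced for illustration; it is not part of Mesos, stout, or libprocess.

```cpp
#include <exception>
#include <fstream>
#include <iterator>
#include <string>
#include <utility>

// Hypothetical helper (an assumption for illustration, not Mesos code):
// reads a whole file and reports failure through the return value
// instead of letting an exception escape.
// Returns {true, contents} on success and {false, error message} on failure.
std::pair<bool, std::string> readFile(const std::string& path)
{
  std::ifstream file(path.c_str(), std::ios::in | std::ios::binary);
  if (!file.is_open()) {
    return std::make_pair(false, "failed to open '" + path + "'");
  }

  try {
    // istreambuf_iterator reads from the underlying filebuf directly, so an
    // exception thrown while refilling the buffer can propagate to this frame.
    std::string contents(
        (std::istreambuf_iterator<char>(file)),
        std::istreambuf_iterator<char>());
    return std::make_pair(true, contents);
  } catch (const std::exception& e) {
    // Convert the exception into an error value the caller can inspect.
    return std::make_pair(false, std::string(e.what()));
  }
}
```

A caller would check the first member of the returned pair and log or skip the failed read, for example when the file belongs to a process that has already exited.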

