HAO SU created MESOS-10232:
------------------------------

             Summary: Old sandboxes not being GC'ed caused frequent Mesos GC
                 Key: MESOS-10232
                 URL: https://issues.apache.org/jira/browse/MESOS-10232
             Project: Mesos
          Issue Type: Bug
          Components: agent
            Reporter: HAO SU


Customers reported that their logs (sandbox files) go missing soon after the 
job completes. Mesos agent logs indicate that the files were GC'ed within 
minutes of container exit. Checking the host revealed many old sandboxes 
dating back to Jan 2020. They occupy a lot of space (~88% of all 
sandbox usage) and are likely causing frequent GC of recently exited containers. 
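
A rough way to reproduce the "~88% of all sandbox usage" kind of measurement on a host is sketched below. It is only an illustration, not the method used for this report: the function names are hypothetical, the caller supplies the list of executor sandbox directories, and a directory's mtime is assumed to be a reasonable proxy for its age.

```python
import os
import time


def dir_size(path):
    """Sum the sizes of all files under path (symlinks are not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # the file vanished while we were scanning
    return total


def old_sandbox_share(sandbox_dirs, max_age_days, now=None):
    """Return (old_bytes, total_bytes, fraction_old).

    A sandbox counts as "old" when the mtime of its top-level directory
    is more than max_age_days before `now` (an assumption of this sketch).
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    old = total = 0
    for d in sandbox_dirs:
        size = dir_size(d)
        total += size
        if os.stat(d).st_mtime < cutoff:
            old += size
    return old, total, (old / total if total else 0.0)
```

Passing the executor directories found under the agent's work directory, with `max_age_days` set to the expected retention period, gives the fraction of sandbox bytes held by directories that should already have been collected.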

Mesos does recognize these sandboxes and tries to schedule them for deletion:
{code:java}
 I0902 18:02:27.511576 467334 gc.cpp:95] Scheduling 
'/var/lib/mesos/meta/slaves/68caec4c-6ea5-44e7-9f8-fad1922d5-S162/frameworks/3dcc744f-016c-6579-9b82-6325402d2-9999/executors/fa00-29a3-4c47-95fd-808d52ac53-13-1'
 for gc -85.5641509780737weeks in the future
{code}
but the deletion seems to never happen.
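
To make the negative delay in the log line easier to read, the sketch below extracts it and converts it to a `timedelta`. The helper name and regular expression are hypothetical. Only the `weeks` suffix appears in the log above; the other unit suffixes are assumptions about how the agent prints durations.

```python
import re
from datetime import timedelta

# Maps the unit suffix printed in the log to a timedelta keyword.
# Only "weeks" is confirmed by the log above; the rest are assumed.
_UNITS = {
    "weeks": "weeks",
    "days": "days",
    "hrs": "hours",
    "mins": "minutes",
    "secs": "seconds",
}

_GC_RE = re.compile(r"for gc\s+(-?\d+(?:\.\d+)?)\s*(weeks|days|hrs|mins|secs)")


def gc_delay(log_text):
    """Return the scheduled GC delay as a timedelta, or None if absent."""
    m = _GC_RE.search(log_text)
    if m is None:
        return None
    return timedelta(**{_UNITS[m.group(2)]: float(m.group(1))})


line = "Scheduling '<path>' for gc -85.5641509780737weeks in the future"
delay = gc_delay(line)
print(delay.total_seconds() / 86400)  # roughly -599 days
```

A negative result means the scheduled removal time is already in the past, here by roughly 599 days, so filtering agent logs for negative delays is one way to list sandboxes that are overdue for deletion.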



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
