[jira] [Commented] (MESOS-10143) Outstanding Offers accumulating

2020-07-08 Thread Puneet Kumar (Jira)


[ 
https://issues.apache.org/jira/browse/MESOS-10143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17153796#comment-17153796
 ] 

Puneet Kumar commented on MESOS-10143:
--

[~greggomann],

In the following example, 3 offers were sent by the Mesos master at 05:59:07 
but were only received by the scheduler at 06:16:26, a delay of about 17 
minutes.

+Logs from Mesos Master:+

I0708 05:59:07.097918 5776 master.cpp:9722] Sending offers [ 
64bb9634-4038-448b-8323-a9877f51f524-O10488046, 
64bb9634-4038-448b-8323-a9877f51f524-O10488047, 
64bb9634-4038-448b-8323-a9877f51f524-O10488048 ] to framework 
black-falcon-scheduler-puneetku (Black Falcon) at 
scheduler-a1fe5552-c044-4ff0-800c-9dbae2c27c45@10.162.31.219:40391


+Logs from the Mesos native library (with Java bindings), after setting GLOG_v=3:+

I0708 06:16:26.478953 80704 sched.cpp:890] Received 3 offers
I0708 06:16:26.478965 80704 pid.cpp:91] Attempting to parse 
'slave(1)@10.89.65.85:5051' into a PID
I0708 06:16:26.478971 80704 sched.cpp:900] Saving PID 
'slave(1)@10.89.65.85:5051'
I0708 06:16:26.478981 80704 pid.cpp:91] Attempting to parse 
'slave(1)@10.89.103.172:5051' into a PID
I0708 06:16:26.478986 80704 sched.cpp:900] Saving PID 
'slave(1)@10.89.103.172:5051'
I0708 06:16:26.478993 80704 pid.cpp:91] Attempting to parse 
'slave(1)@10.88.84.46:5051' into a PID
I0708 06:16:26.478999 80704 sched.cpp:900] Saving PID 
'slave(1)@10.88.84.46:5051'
I0708 06:16:26.485853 80704 sched.cpp:914] Scheduler::resourceOffers took 
6.788912ms


+Logs from the scheduler's resourceOffers method, which is implemented in Java:+

08 Jul 2020 06:16:26,479 [INFO] (Thread-14611242) 
com.blackfalconservice.mesos.scheduler.MesosScheduler$$EnhancerByGuice$$c8e6e266:
 resourceOffers: Offer Received 
64bb9634-4038-448b-8323-a9877f51f524-O10488046,64bb9634-4038-448b-8323-a9877f51f524-O10488047,64bb9634-4038-448b-8323-a9877f51f524-O10488048
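
The delay between the master and driver log lines above can be computed directly from their glog prefixes. The following is a minimal sketch, not part of Mesos: the helper name glog_time is hypothetical, the year is passed in because glog lines do not carry one, and the result is only meaningful if the master and scheduler hosts have synchronized clocks in the same time zone.

```python
import re
from datetime import datetime

# glog prefix: severity letter, MMDD, HH:MM:SS.micros, thread id, file:line]
GLOG_PREFIX = re.compile(
    r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+\s+\S+:\d+\]"
)


def glog_time(line, year=2020):
    """Parse the timestamp of one glog line (hypothetical helper).

    The year is an assumption supplied by the caller, since glog omits it.
    """
    match = GLOG_PREFIX.match(line)
    if match is None:
        raise ValueError("not a glog line: %r" % line)
    stamp = "%d%s %s" % (year, match.group(1), match.group(2))
    return datetime.strptime(stamp, "%Y%m%d %H:%M:%S.%f")


# Prefixes copied from the master and scheduler driver logs above.
sent = glog_time("I0708 05:59:07.097918 5776 master.cpp:9722] Sending offers")
received = glog_time("I0708 06:16:26.478953 80704 sched.cpp:890] Received 3 offers")

delay = received - sent
print(delay)
```

Running the same comparison over matching offer IDs at different times of day would show whether the delay grows steadily or jumps at a particular point.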


All offers sent between 05:59:07 and 06:16:26 were counted as 
outstanding_offers by the Mesos master. This gap keeps growing, from minutes 
to hours, until I restart the scheduler process. How can I find out where 
these outstanding offers are queued before they are delivered to the 
scheduler, and why is there a delay of several minutes?
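
One way to see how the backlog grows over time is to sample the master's master/outstanding_offers metric from its /metrics/snapshot endpoint at regular intervals. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the master URL must be supplied by the caller, and the sample body is made up for illustration. It does not show where the offers are queued, only how many the master considers outstanding.

```python
import json
import urllib.request

# Metric key exposed by the Mesos master in /metrics/snapshot.
METRIC = "master/outstanding_offers"


def outstanding_offers(snapshot_text):
    """Return the outstanding-offers gauge from a snapshot JSON body."""
    snapshot = json.loads(snapshot_text)
    return snapshot[METRIC]


def fetch_snapshot(master_url, timeout=5.0):
    """Fetch the raw snapshot body from a master.

    master_url (for example "http://<master-host>:5050") is an assumption
    that the caller must fill in for their own cluster.
    """
    url = master_url + "/metrics/snapshot"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


# Made-up sample body for illustration; real snapshots contain many more keys.
sample = '{"master/outstanding_offers": 3.0, "master/frameworks_active": 1.0}'
count = outstanding_offers(sample)
print(count)
```

Logging this value alongside the scheduler's own "Received N offers" lines would show whether the master's count rises before or after the driver starts falling behind.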

 

> Outstanding Offers accumulating
> ---
>
> Key: MESOS-10143
> URL: https://issues.apache.org/jira/browse/MESOS-10143
> Project: Mesos
> Issue Type: Bug
> Components: master, scheduler driver
> Affects Versions: 1.7.0
> Environment: Mesos Version 1.7.0, JDK 8.0
> Reporter: Puneet Kumar
> Priority: Minor
>
> We manage an Apache Mesos cluster version 1.7.0. We have written a framework 
> in Java that schedules tasks to Mesos master at a rate of 300 TPS. Everything 
> works fine for almost 24 hours but then outstanding offers accumulate & 
> saturate within 15 minutes. Outstanding offers aren't reclaimed by Mesos 
> master. We observe "RescindOffer" messages in verbose (GLOG v=3) framework 
> logs but outstanding offers don't reduce. New resources aren't offered to 
> framework when outstanding offers saturate. We have to restart the scheduler 
> to reset outstanding offers to zero.
> Any suggestions to debug this issue are welcome.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Assigned] (MESOS-10148) Update the `CSIPluginInfo` protobuf message for supporting 3rd party CSI plugins

2020-07-08 Thread Qian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/MESOS-10148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qian Zhang reassigned MESOS-10148:
--

Assignee: Qian Zhang

RR: [https://reviews.apache.org/r/72661/]

> Update the `CSIPluginInfo` protobuf message for supporting 3rd party CSI 
> plugins
> 
>
> Key: MESOS-10148
> URL: https://issues.apache.org/jira/browse/MESOS-10148
> Project: Mesos
> Issue Type: Task
> Reporter: Qian Zhang
> Assignee: Qian Zhang
> Priority: Major
>
> See 
> [here|https://docs.google.com/document/d/1NfWLS2OdiYjXZa2dpd_DOWOK4eou-SedY396Jl68s9Y/edit#bookmark=id.x6m8mytigrg7]
>  for the detailed design.





[jira] [Assigned] (MESOS-10147) Introduce a new volume type `CSI` into the `Volume` protobuf message

2020-07-08 Thread Qian Zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/MESOS-10147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qian Zhang reassigned MESOS-10147:
--

Assignee: Qian Zhang

RR: [https://reviews.apache.org/r/72660/]

> Introduce a new volume type `CSI` into the `Volume` protobuf message
> 
>
> Key: MESOS-10147
> URL: https://issues.apache.org/jira/browse/MESOS-10147
> Project: Mesos
> Issue Type: Task
> Reporter: Qian Zhang
> Assignee: Qian Zhang
> Priority: Major
>
> See 
> [here|https://docs.google.com/document/d/1NfWLS2OdiYjXZa2dpd_DOWOK4eou-SedY396Jl68s9Y/edit#heading=h.l7wa1w8789pg]
>  for the detailed design.


