[ 
https://issues.apache.org/jira/browse/FLINK-9932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638954#comment-16638954
 ] 

ASF GitHub Bot commented on FLINK-9932:
---------------------------------------

isunjin commented on a change in pull request #6780: [FLINK-9932] [runtime] fix slot leak when task executor offer slot to job master timeout
URL: https://github.com/apache/flink/pull/6780#discussion_r222842321
 
 

 ##########
 File path: flink-runtime/src/main/java/org/apache/flink/runtime/taskexecutor/TaskExecutor.java
 ##########
 @@ -1076,6 +1076,19 @@ private void offerSlotsToJobManager(final JobID jobId) {
                 if (throwable instanceof TimeoutException) {
                     log.info("Slot offering to JobManager did not finish in time. Retrying the slot offering.");
                     // We ran into a timeout. Try again.
+                    for (SlotOffer offer : reservedSlots) {
+                        try {
+                            if (!taskSlotTable.markSlotInactive(offer.getAllocationId(), taskManagerConfiguration.getTimeout())) {
+                                // the slot is either free or releasing at the moment
+                                final String message = "Could not mark slot " + jobId + " active.";
 
 Review comment:
   merge the log to a single line.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> If task executor offer slot to job master timeout the first time, the slot 
> will leak
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-9932
>                 URL: https://issues.apache.org/jira/browse/FLINK-9932
>             Project: Flink
>          Issue Type: Bug
>          Components: Cluster Management
>    Affects Versions: 1.5.0
>            Reporter: shuai.xu
>            Assignee: shuai.xu
>            Priority: Major
>              Labels: pull-request-available
>
> When the task executor offers a slot to the job master, it first marks the
> slot as ACTIVE.
> If the offer call times out, the task executor calls
> offerSlotsToJobManager again, but that method only offers slots in the
> ALLOCATED state. Because the slot has already been marked ACTIVE, it is
> never offered again, which causes a slot leak.
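
The leak described above can be illustrated with a toy model. This is a
minimal sketch, not Flink's actual TaskSlotTable: the class SlotLeakSketch,
its methods, and the markInactiveOnTimeout flag are hypothetical names
invented for illustration. It assumes the fix moves reserved slots back to
ALLOCATED when the offer times out, as the markSlotInactive call in the
diff suggests.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of the slot states in the issue; hypothetical, not Flink code. */
public class SlotLeakSketch {

    enum SlotState { ALLOCATED, ACTIVE }

    // allocation id -> current state (stand-in for the real slot table)
    private final Map<String, SlotState> slots = new LinkedHashMap<>();

    void allocate(String allocationId) {
        slots.put(allocationId, SlotState.ALLOCATED);
    }

    /** Marks every ALLOCATED slot ACTIVE and returns the ids that would be offered. */
    List<String> reserveSlotsForOffer() {
        List<String> reserved = new ArrayList<>();
        for (Map.Entry<String, SlotState> entry : slots.entrySet()) {
            if (entry.getValue() == SlotState.ALLOCATED) {
                entry.setValue(SlotState.ACTIVE);
                reserved.add(entry.getKey());
            }
        }
        return reserved;
    }

    /** Timeout handling; the assumed fix moves reserved slots back to ALLOCATED. */
    void onOfferTimeout(List<String> reserved, boolean markInactiveOnTimeout) {
        if (markInactiveOnTimeout) {
            for (String allocationId : reserved) {
                slots.put(allocationId, SlotState.ALLOCATED);
            }
        }
    }

    /** Returns how many slots the retry offers after the first offer timed out. */
    public static int slotsOfferedOnRetry(boolean markInactiveOnTimeout) {
        SlotLeakSketch table = new SlotLeakSketch();
        table.allocate("slot-1");
        List<String> firstOffer = table.reserveSlotsForOffer();
        table.onOfferTimeout(firstOffer, markInactiveOnTimeout);
        return table.reserveSlotsForOffer().size();
    }

    public static void main(String[] args) {
        System.out.println("retry offers without fix: " + slotsOfferedOnRetry(false));
        System.out.println("retry offers with fix: " + slotsOfferedOnRetry(true));
    }
}
```

Without the reset, the retry finds no ALLOCATED slot and offers nothing, so
the slot stays ACTIVE on the task executor while the job master never
learns about it. With the reset, the retry offers the slot again.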



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
