[ https://issues.apache.org/jira/browse/YUNIKORN-1790?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Qi Zhu resolved YUNIKORN-1790. ------------------------------ Fix Version/s: 1.4.0 Resolution: Fixed > TestTryAllocatePreemptNode: flaky test > -------------------------------------- > > Key: YUNIKORN-1790 > URL: https://issues.apache.org/jira/browse/YUNIKORN-1790 > Project: Apache YuniKorn > Issue Type: Improvement > Components: test - unit > Reporter: Wilfred Spiegelenburg > Assignee: Qi Zhu > Priority: Major > Labels: flaky-test, pull-request-available > Fix For: 1.4.0 > > > The TestTryAllocatePreemptNode fails sometimes with a different allocation > being preempted than the expected alloc1: > {code:java} > 2023-06-07T12:40:35.035+1000 INFO objects/queue.go:139 configured > queue added to scheduler {"queueName": "root"} > 2023-06-07T12:40:35.035+1000 INFO objects/queue.go:1190 updating root > queue max resources {"current max": "nil resource", "new max": > "map[first:40]"} > 2023-06-07T12:40:35.035+1000 DEBUG objects/queue.go:850 new parent > queue inheriting template from parent queue {"child queue": "root.parent", > "parent queue": "root"} > 2023-06-07T12:40:35.035+1000 INFO objects/queue.go:139 configured > queue added to scheduler {"queueName": "root.parent"} > 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:343 max resources > setting ignored: cannot set zero max resources > 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:350 guaranteed > resources setting ignored: cannot set zero max resources > 2023-06-07T12:40:35.036+1000 INFO objects/queue.go:139 configured > queue added to scheduler {"queueName": "root.unlimited"} > 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:343 max resources > setting ignored: cannot set zero max resources2023-06-07T12:40:35.036+1000 > INFO objects/queue.go:139 configured queue added to scheduler > {"queueName": "root.parent.child1"} > 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:343 max resources > setting ignored: cannot set zero max resources > 2023-06-07T12:40:35.036+1000 INFO objects/queue.go:139 configured > queue added to scheduler {"queueName": "root.parent.child2"} > 2023-06-07T12:40:35.036+1000 INFO objects/application.go:184 Unknown > gang scheduling style, using soft style as default {"gang scheduling > style": ""} > 2023-06-07T12:40:35.036+1000 DEBUG objects/queue.go:1780 Unknown > application {"applicationID": "app-0"} > 2023-06-07T12:40:35.037+1000 INFO objects/application_state.go:133 > Application state transition {"appID": "app-0", "source": "New", > "destination": "Accepted", "event": "runApplication"} > 2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask > added successfully to application {"appID": "app-0", "ask": "alloc0-0", > "placeholder": false, "pendingDelta": "map[first:11]"} > 2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask > added successfully to application {"appID": "app-0", "ask": "alloc0-1", > "placeholder": false, "pendingDelta": "map[first:11]"} > 2023-06-07T12:40:35.037+1000 INFO objects/application.go:184 Unknown > gang scheduling style, using soft style as default {"gang scheduling > style": ""} > 2023-06-07T12:40:35.037+1000 DEBUG objects/queue.go:1780 Unknown > application {"applicationID": "app-1"} > 2023-06-07T12:40:35.037+1000 INFO objects/application_state.go:133 > Application state transition {"appID": "app-1", "source": "New", > "destination": "Accepted", "event": "runApplication"} > 2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask > added successfully to application {"appID": "app-1", "ask": "alloc1", > "placeholder": false, "pendingDelta": "map[first:5]"} > 2023-06-07T12:40:35.037+1000 INFO objects/application.go:668 ask > added successfully to application {"appID": "app-1", "ask": "alloc2", > "placeholder": false, "pendingDelta": "map[first:5]"} > 2023-06-07T12:40:35.038+1000 INFO objects/application.go:184 Unknown > gang scheduling style, using soft style as default {"gang scheduling > style": ""} > 2023-06-07T12:40:35.038+1000 DEBUG objects/queue.go:1780 Unknown > application {"applicationID": "app-2"} > 2023-06-07T12:40:35.038+1000 INFO objects/application_state.go:133 > Application state transition {"appID": "app-2", "source": "New", > "destination": "Accepted", "event": "runApplication"} > 2023-06-07T12:40:35.038+1000 INFO objects/application.go:668 ask > added successfully to application {"appID": "app-2", "ask": "alloc3", > "placeholder": false, "pendingDelta": "map[first:5]"} > 2023-06-07T12:40:35.038+1000 DEBUG objects/application.go:339 > Application state timer initiated {"appID": "app-0", "state": > "Starting", "timeout": "5m0s"} > 2023-06-07T12:40:35.038+1000 INFO objects/application_state.go:133 > Application state transition {"appID": "app-0", "source": "Accepted", > "destination": "Starting", "event": "runApplication"} > 2023-06-07T12:40:35.038+1000 DEBUG ugm/manager.go:63 Increasing > resource usage {"user": "testuser", "queue path": "root.unlimited", > "application": "app-0", "resource": "map[first:11]"} > 2023-06-07T12:40:35.038+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "root.unlimited", "application": "app-0", > "resource": "map[first:11]"} > 2023-06-07T12:40:35.038+1000 DEBUG ugm/queue_tracker.go:45 Creating queue > tracker object for queue {"queue": "unlimited"} > 2023-06-07T12:40:35.038+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "unlimited", "application": "app-0", > "resource": "map[first:11]"} > 2023-06-07T12:40:35.038+1000 DEBUG ugm/manager.go:265 Group tracker > already exists and linking (reusing) the same with application > {"application": "app-0", "queue path": "root.unlimited", "user": "testuser", > "group": "testgroup"} > 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "root.unlimited", "application": "app-0", > "resource": "map[first:11]"} > 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:45 Creating queue > tracker object for queue {"queue": "unlimited"} > 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "unlimited", "application": "app-0", > "resource": "map[first:11]"} > 2023-06-07T12:40:35.039+1000 DEBUG objects/application.go:388 > Application state timer cleared {"appID": "app-0", "state": "Starting"} > 2023-06-07T12:40:35.039+1000 INFO objects/application_state.go:133 > Application state transition {"appID": "app-0", "source": "Starting", > "destination": "Running", "event": "runApplication"} > 2023-06-07T12:40:35.039+1000 DEBUG ugm/manager.go:63 Increasing > resource usage {"user": "testuser", "queue path": "root.unlimited", > "application": "app-0", "resource": "map[first:11]"} > 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "root.unlimited", "application": "app-0", > "resource": "map[first:11]"} > 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "unlimited", "application": "app-0", > "resource": "map[first:11]"} > 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "root.unlimited", "application": "app-0", > "resource": "map[first:11]"} > 2023-06-07T12:40:35.039+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "unlimited", "application": "app-0", > "resource": "map[first:11]"} > 2023-06-07T12:40:35.040+1000 DEBUG objects/application.go:339 > Application state timer initiated {"appID": "app-1", "state": > "Starting", "timeout": "5m0s"} > 2023-06-07T12:40:35.040+1000 INFO objects/application_state.go:133 > Application state transition {"appID": "app-1", "source": "Accepted", > "destination": "Starting", "event": "runApplication"} > 2023-06-07T12:40:35.040+1000 DEBUG ugm/manager.go:63 Increasing > resource usage {"user": "testuser", "queue path": "root.parent.child1", > "application": "app-1", "resource": "map[first:5]"} > 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "root.parent.child1", "application": > "app-1", "resource": "map[first:5]"} > 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "parent.child1", "application": "app-1", > "resource": "map[first:5]"} > 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "child1", "application": "app-1", > "resource": "map[first:5]"} > 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "root.parent.child1", "application": > "app-1", "resource": "map[first:5]"} > 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "parent.child1", "application": "app-1", > "resource": "map[first:5]"} > 2023-06-07T12:40:35.040+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "child1", "application": "app-1", > "resource": "map[first:5]"} > 2023-06-07T12:40:35.041+1000 DEBUG objects/application.go:388 > Application state timer cleared {"appID": "app-1", "state": "Starting"} > 2023-06-07T12:40:35.041+1000 INFO objects/application_state.go:133 > Application state transition {"appID": "app-1", "source": "Starting", > "destination": "Running", "event": "runApplication"} > 2023-06-07T12:40:35.041+1000 DEBUG ugm/manager.go:63 Increasing > resource usage {"user": "testuser", "queue path": "root.parent.child1", > "application": "app-1", "resource": "map[first:5]"} > 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "root.parent.child1", "application": > "app-1", "resource": "map[first:5]"} > 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "parent.child1", "application": "app-1", > "resource": "map[first:5]"} > 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "child1", "application": "app-1", > "resource": "map[first:5]"} > 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "root.parent.child1", "application": > "app-1", "resource": "map[first:5]"} > 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "parent.child1", "application": "app-1", > "resource": "map[first:5]"} > 2023-06-07T12:40:35.041+1000 DEBUG ugm/queue_tracker.go:57 Increasing > resource usage {"queue path": "child1", "application": "app-1", > "resource": "map[first:5]"} > 2023-06-07T12:40:35.041+1000 DEBUG objects/application.go:1454 app > reservation check {"allocationKey": "alloc3", "createTime": > "2023-06-07T12:40:25.038+1000", "askAge": "10.003814459s", > "reservationDelay": "2s"} > 2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1454 app > reservation check {"allocationKey": "alloc3", "createTime": > "2023-06-07T12:40:25.038+1000", "askAge": "10.003914s", "reservationDelay": > "2s"} > 2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1470 found > candidate node for app reservation {"appID": "app-2", "nodeID": > "node1", "allocationKey": "alloc3", "reservations": 0, "pendingRepeats": 1} > 2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1454 app > reservation check {"allocationKey": "alloc3", "createTime": > "2023-06-07T12:39:55.038+1000", "askAge": "40.004148584s", > "reservationDelay": "2s"} > 2023-06-07T12:40:35.042+1000 DEBUG objects/application.go:1454 app > reservation check {"allocationKey": "alloc3", "createTime": > "2023-06-07T12:39:55.038+1000", "askAge": "40.004227875s", > "reservationDelay": "2s"} > 2023-06-07T12:40:35.042+1000 DEBUG objects/preemption.go:352 No RM > callback plugin registered, using first selected node for preemption > {"NodeID": "node2", "AllocationKey": "alloc3"} > 2023-06-07T12:40:35.043+1000 ERROR objects/preemption.go:517 BUG: > Didn't find instance type in the nodeInstanceTypeMap {"nodeId": "node2"} > github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Preemptor).tryNodes > > /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/preemption.go:517 > github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Preemptor).TryPreemption > > /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/preemption.go:537 > github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryPreemption > > /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application.go:1327 > github.com/apache/yunikorn-core/pkg/scheduler/objects.(*Application).tryAllocate > > /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application.go:1050 > github.com/apache/yunikorn-core/pkg/scheduler/objects.TestTryAllocatePreemptNode > > /Users/wilfred/Downloads/yunikorn-1.3.0/apache-yunikorn-1.3.0-src/core/pkg/scheduler/objects/application_test.go:1874 > testing.tRunner /usr/local/go/src/testing/testing.go:1576 > 2023-06-07T12:40:35.043+1000 INFO objects/preemption.go:559 > Preempting task {"applicationID": "app-1", "allocationKey": "alloc2", > "nodeID": "node2", "resources": "map[first:5]"} > 2023-06-07T12:40:35.043+1000 INFO objects/preemption.go:580 > Reserving node for ask after preemption {"allocationKey": "alloc3", "nodeID": > "node2", "victimCount": 1} > --- FAIL: TestTryAllocatePreemptNode (0.06s) > application_test.go:1876: assertion failed: expression is false: > alloc1.IsPreempted(): alloc1 should have been preempted {code} -- This message was sent by Atlassian Jira (v8.20.10#820010) --------------------------------------------------------------------- To unsubscribe, e-mail: dev-unsubscr...@yunikorn.apache.org For additional commands, e-mail: dev-h...@yunikorn.apache.org