Adam Antal created YUNIKORN-390:
-----------------------------------
Summary: SIGSEGV if parent queue does not exist for tag rule
Key: YUNIKORN-390
URL: https://issues.apache.org/jira/browse/YUNIKORN-390
Project: Apache YuniKorn
Issue Type: Bug
Components: shim - kubernetes
Affects Versions: 0.10
Reporter: Adam Antal
The scheduler crashes if the parent queue specified for the tag placement rule
does not exist.
The bug is in this line:
{code:go}
if info.GetQueue(parentName).IsLeafQueue() {
    return "", fmt.Errorf("parent rule returned a leaf queue: %s", parentName)
}
{code}
{{info.GetQueue(parentName)}} returns nil, which causes the crash.
Full stack trace:
{noformat}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x198b707]
goroutine 116 [running]:
github.com/apache/incubator-yunikorn-core/pkg/cache.(*QueueInfo).IsLeafQueue(...)
	/Users/adamantal/go/pkg/mod/github.com/apache/[email protected]/pkg/cache/queue_info.go:198
github.com/apache/incubator-yunikorn-core/pkg/scheduler/placement.(*tagRule).placeApplication(0xc005d50050, 0xc000494100, 0xc0006bc210, 0xc00644a300, 0x2, 0x2, 0x10502b1)
	/Users/adamantal/go/pkg/mod/github.com/apache/[email protected]/pkg/scheduler/placement/tag_rule.go:93 +0xb47
github.com/apache/incubator-yunikorn-core/pkg/scheduler/placement.(*AppPlacementManager).PlaceApplication(0xc005d50000, 0xc000494100, 0x0, 0x0)
	/Users/adamantal/go/pkg/mod/github.com/apache/[email protected]/pkg/scheduler/placement/placement.go:141 +0x485
github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*partitionSchedulingContext).addSchedulingApplication(0xc0002e20e0, 0xc005b36120, 0x0, 0x0)
	/Users/adamantal/go/pkg/mod/github.com/apache/[email protected]/pkg/scheduler/scheduling_partition.go:108 +0x892
github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*ClusterSchedulingContext).addSchedulingApplication(0xc000012000, 0xc005b36120, 0x0, 0x0)
	/Users/adamantal/go/pkg/mod/github.com/apache/[email protected]/pkg/scheduler/scheduling_context.go:114 +0x1d5
github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*Scheduler).addNewApplication(0xc000390000, 0xc000494100, 0xc000738121, 0x9)
	/Users/adamantal/go/pkg/mod/github.com/apache/[email protected]/pkg/scheduler/scheduler.go:209 +0x277
github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*Scheduler).processApplicationUpdateEvent(0xc000390000, 0xc00a7541e0)
	/Users/adamantal/go/pkg/mod/github.com/apache/[email protected]/pkg/scheduler/scheduler.go:447 +0x9ec
github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*Scheduler).handleSchedulerEvent(0xc000390000)
	/Users/adamantal/go/pkg/mod/github.com/apache/[email protected]/pkg/scheduler/scheduler.go:596 +0x40a
created by github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*Scheduler).StartService
	/Users/adamantal/go/pkg/mod/github.com/apache/[email protected]/pkg/scheduler/scheduler.go:67 +0x9e
{noformat}
I also attach the placement rule, but note that I was working on YUNIKORN-368,
so the code is not 100% the same:
{noformat}
partitions:
  - name: default
    placementrules:
      - name: tag
        value: namespace
        create: true
        parent:
          name: tag
          value: "namespace.parentqueue"
          create: true
    queues:
      - name: root
        submitacl: '*'
        queues:
          - name: default
            submitacl: '*'
where the {{namespace.parentqueue}} is set to "root.special".
My proposal is that the scheduler should not crash even if the queue does not
exist. Let's add a check that the queue exists before using the {{QueueInfo}}
object.
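A minimal sketch of the kind of guard I have in mind. The types {{queueInfo}} and {{partitionInfo}} and the helper {{checkParent}} are simplified stand-ins made up for illustration, not the real core types. What the rule should do when the parent is missing (create it, or fail the placement) is left open here; the sketch only shows the check that avoids the nil dereference.

```go
package main

import "fmt"

// queueInfo is a hypothetical, simplified stand-in for the core's QueueInfo.
type queueInfo struct {
	isLeaf bool
}

// IsLeafQueue reads a field of the receiver, so it panics on a nil receiver.
func (q *queueInfo) IsLeafQueue() bool {
	return q.isLeaf
}

// partitionInfo is a hypothetical stand-in for the object that resolves queues.
type partitionInfo struct {
	queues map[string]*queueInfo
}

// GetQueue returns nil when no queue with the given name exists.
func (p *partitionInfo) GetQueue(name string) *queueInfo {
	return p.queues[name]
}

// checkParent reports whether the parent queue exists and returns an error
// only when it exists and is a leaf. The nil check comes before IsLeafQueue.
func checkParent(info *partitionInfo, parentName string) (bool, error) {
	parent := info.GetQueue(parentName)
	if parent == nil {
		// Assumption: the caller decides what to do with a missing parent.
		return false, nil
	}
	if parent.IsLeafQueue() {
		return true, fmt.Errorf("parent rule returned a leaf queue: %s", parentName)
	}
	return true, nil
}

func main() {
	info := &partitionInfo{queues: map[string]*queueInfo{
		"root":         {isLeaf: false},
		"root.default": {isLeaf: true},
	}}
	for _, name := range []string{"root", "root.default", "root.special"} {
		found, err := checkParent(info, name)
		fmt.Printf("%s: found=%t err=%v\n", name, found, err)
	}
}
```

With the configuration above, "root.special" falls into the missing-parent branch instead of crashing the scheduler.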