[ https://issues.apache.org/jira/browse/YUNIKORN-390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Adam Antal updated YUNIKORN-390:
--------------------------------
    Description: 
The scheduler crashes if the parent queue specified for the tag placement rule 
does not exist.
The bug is in this line ({{core/pkg/scheduler/placement/tag_rule.go#93}}):
{code:go}
if info.GetQueue(parentName).IsLeafQueue() {
  return "", fmt.Errorf("parent rule returned a leaf queue: %s", parentName)
}
{code}
{{info.GetQueue(parentName)}} returns nil, which causes the crash. 
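To illustrate the mechanism: in Go, calling a method on a nil pointer is legal by itself, and the panic only happens once the method body reads a field through the receiver. That would match the trace below, where the top frame is inside {{IsLeafQueue}}. The sketch uses hypothetical stand-in names ({{queue}}, {{isLeaf}}, {{callIsLeaf}}), not the real YuniKorn types:

```go
package main

import "fmt"

// queue is a hypothetical stand-in for the scheduler's QueueInfo type.
type queue struct{ leaf bool }

// isLeaf reads a field through the receiver, so a nil receiver panics here,
// inside the method, not at the call site.
func (q *queue) isLeaf() bool { return q.leaf }

// callIsLeaf calls isLeaf on q and reports whether the call panicked.
func callIsLeaf(q *queue) (leaf bool, panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	leaf = q.isLeaf()
	return leaf, false
}

func main() {
	// An existing queue: the call completes normally.
	fmt.Println(callIsLeaf(&queue{leaf: true}))

	// A missing queue, modelled as a nil pointer: the call panics.
	var missing *queue
	fmt.Println(callIsLeaf(missing))
}
```

In the scheduler nothing recovers from the panic, so the whole process goes down.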
Full stack trace:
{noformat}
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x198b707]

goroutine 116 [running]:
github.com/apache/incubator-yunikorn-core/pkg/cache.(*QueueInfo).IsLeafQueue(...)
        /Users/adamantal/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20200827055746-57d663e73cb1/pkg/cache/queue_info.go:198
github.com/apache/incubator-yunikorn-core/pkg/scheduler/placement.(*tagRule).placeApplication(0xc005d50050, 0xc000494100, 0xc0006bc210, 0xc00644a300, 0x2, 0x2, 0x10502b1)
        /Users/adamantal/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20200827055746-57d663e73cb1/pkg/scheduler/placement/tag_rule.go:93 +0xb47
github.com/apache/incubator-yunikorn-core/pkg/scheduler/placement.(*AppPlacementManager).PlaceApplication(0xc005d50000, 0xc000494100, 0x0, 0x0)
        /Users/adamantal/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20200827055746-57d663e73cb1/pkg/scheduler/placement/placement.go:141 +0x485
github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*partitionSchedulingContext).addSchedulingApplication(0xc0002e20e0, 0xc005b36120, 0x0, 0x0)
        /Users/adamantal/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20200827055746-57d663e73cb1/pkg/scheduler/scheduling_partition.go:108 +0x892
github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*ClusterSchedulingContext).addSchedulingApplication(0xc000012000, 0xc005b36120, 0x0, 0x0)
        /Users/adamantal/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20200827055746-57d663e73cb1/pkg/scheduler/scheduling_context.go:114 +0x1d5
github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*Scheduler).addNewApplication(0xc000390000, 0xc000494100, 0xc000738121, 0x9)
        /Users/adamantal/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20200827055746-57d663e73cb1/pkg/scheduler/scheduler.go:209 +0x277
github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*Scheduler).processApplicationUpdateEvent(0xc000390000, 0xc00a7541e0)
        /Users/adamantal/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20200827055746-57d663e73cb1/pkg/scheduler/scheduler.go:447 +0x9ec
github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*Scheduler).handleSchedulerEvent(0xc000390000)
        /Users/adamantal/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20200827055746-57d663e73cb1/pkg/scheduler/scheduler.go:596 +0x40a
created by github.com/apache/incubator-yunikorn-core/pkg/scheduler.(*Scheduler).StartService
        /Users/adamantal/go/pkg/mod/github.com/apache/incubator-yunikorn-core@v0.0.0-20200827055746-57d663e73cb1/pkg/scheduler/scheduler.go:67 +0x9e
{noformat}
The placement rule configuration is below. Note that I was working on 
YUNIKORN-368 at the time, so the code is not 100% the same:
{noformat}
partitions:
  - name: default
    placementrules:
      - name: tag
        value: namespace
        create: true
        parent:
          name: tag
          value: "namespace.parentqueue"
          create: true
    queues:
      - name: root
        submitacl: '*'
        queues:
          - name: default
            submitacl: '*'
{noformat}
where the {{namespace.parentqueue}} tag is set to "root.special".

My proposal is that the scheduler should not crash even if the queue does not 
exist. Let's check that the queue exists before using the {{QueueInfo}} object.
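A minimal sketch of such a check, assuming hypothetical stand-in types ({{partition}}, {{queueInfo}}) and a hypothetical helper ({{checkParent}}) rather than the real YuniKorn code. It returns an error for a missing parent; whether the actual fix should instead let the queue be created when {{create: true}} is set is left open here:

```go
package main

import "fmt"

// queueInfo is a hypothetical stand-in for the scheduler's QueueInfo type.
type queueInfo struct {
	leaf bool
}

// IsLeafQueue reads a field through the receiver and panics on a nil receiver.
func (q *queueInfo) IsLeafQueue() bool {
	return q.leaf
}

// partition is a hypothetical stand-in for the object behind info.GetQueue.
type partition struct {
	queues map[string]*queueInfo
}

// GetQueue returns nil when no queue with the given name is known.
func (p *partition) GetQueue(name string) *queueInfo {
	return p.queues[name]
}

// checkParent validates the queue named by the parent rule. The nil check
// comes first, so IsLeafQueue is never called on a missing queue.
func checkParent(info *partition, parentName string) error {
	parentQueue := info.GetQueue(parentName)
	if parentQueue == nil {
		return fmt.Errorf("parent rule returned a queue that does not exist: %s", parentName)
	}
	if parentQueue.IsLeafQueue() {
		return fmt.Errorf("parent rule returned a leaf queue: %s", parentName)
	}
	return nil
}

func main() {
	info := &partition{queues: map[string]*queueInfo{
		"root":         {leaf: false},
		"root.default": {leaf: true},
	}}
	// "root.special" is not configured, mirroring the setup in this report.
	for _, name := range []string{"root", "root.default", "root.special"} {
		fmt.Printf("%s: %v\n", name, checkParent(info, name))
	}
}
```

With this shape the missing parent surfaces as an ordinary placement error that the caller can log or reject, instead of a SIGSEGV in the scheduler goroutine.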


> SIGSEGV if parent queue does not exist for tag rule
> ---------------------------------------------------
>
>                 Key: YUNIKORN-390
>                 URL: https://issues.apache.org/jira/browse/YUNIKORN-390
>             Project: Apache YuniKorn
>          Issue Type: Bug
>          Components: shim - kubernetes
>    Affects Versions: 0.10
>            Reporter: Adam Antal
>            Priority: Blocker


