[
https://issues.apache.org/jira/browse/YUNIKORN-1102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17500570#comment-17500570
]
Tseng Hsi-Huang commented on YUNIKORN-1102:
-------------------------------------------
After I modify _GetApplication_ from
{code:go}
func (ctx *Context) GetApplication(appID string) interfaces.ManagedApp {
	ctx.lock.RLock()
	defer ctx.lock.RUnlock()
	if app, ok := ctx.applications[appID]; ok {
		return app
	}
	return nil
}
{code}
to
{code:go}
func (ctx *Context) GetApplication(appID string) interfaces.ManagedApp {
	ctx.lock.RLock()
	defer ctx.lock.RUnlock()
	return ctx.applications[appID]
}
{code}
and run
{code:sh}
make test
{code}
then I get the errors below (nil pointer panics in pkg/appmgmt, pkg/cache and pkg/shim)
{code}
cleaning up caches and output
go clean -cache -testcache -r -x ./... 2>&1 >/dev/null
rm -rf _output queues.yaml k8s_yunikorn_scheduler \
./deployments/image/configmap/k8s_yunikorn_scheduler \
./deployments/image/configmap/queues.yaml \
./deployments/image/admission/scheduler-admission-controller
running unit tests
go test ./pkg/... -cover -race -tags deadlock -coverprofile=coverage.txt
-covermode=atomic
?
github.com/apache/incubator-yunikorn-k8shim/pkg/apis/yunikorn.apache.org
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/apis/yunikorn.apache.org/v1alpha1
[no test files]
2022-03-03T15:57:44.774+0800 INFO log/logger.go:89 scheduler
configuration, pretty print \{"configs": "{\n \"schedulerName\":
\"yunikorn\",\n \"clusterId\": \"my-kube-cluster\",\n \"clusterVersion\":
\"0.1\",\n \"policyGroup\": \"queues\",\n \"schedulingIntervalSecond\":
1000000000,\n \"absoluteKubeConfigFilePath\": \"\",\n \"loggingLevel\": 0,\n
\"logEncoding\": \"console\",\n \"logFilePath\": \"\",\n \"volumeBindTimeout\":
10000000000,\n \"testMode\": false,\n \"eventChannelCapacity\": 1048576,\n
\"dispatchTimeout\": 300000000000,\n \"kubeQPS\": 1000,\n \"kubeBurst\":
1000,\n \"predicates\": \"\",\n \"operatorPlugins\": \"mocked-app-manager\",\n
\"enableConfigHotRefresh\": false,\n \"disableGangScheduling\": false,\n
\"userLabelKey\": \"yunikorn.apache.org/username\"\n}"}
2022-03-03T15:57:44.774+0800 INFO appmgmt/appmgmt.go:50 Initializing
new AppMgmt service
2022-03-03T15:57:44.774+0800 INFO appmgmt/appmgmt.go:83 registering app
management service \{"serviceName": "mocked-app-manager"}
2022-03-03T15:57:44.774+0800 INFO appmgmt/appmgmt_recovery.go:46
Starting app recovery
2022-03-03T15:57:44.774+0800 INFO cache/application.go:466 handle
app recovering \{"app": "applicationID: app01, queue: root.a, partition:
default, totalNumOfTasks: 0, currentState: Recovering", "clusterID":
"my-kube-cluster"}
2022-03-03T15:57:44.774+0800 INFO cache/application.go:466 handle
app recovering \{"app": "applicationID: app02, queue: root.a, partition:
default, totalNumOfTasks: 0, currentState: Recovering", "clusterID":
"my-kube-cluster"}
2022-03-03T15:57:44.775+0800 INFO appmgmt/appmgmt.go:50 Initializing
new AppMgmt service
2022-03-03T15:57:44.775+0800 INFO appmgmt/appmgmt.go:83 registering app
management service \{"serviceName": "mocked-app-manager"}
2022-03-03T15:57:44.775+0800 INFO appmgmt/appmgmt_recovery.go:46
Starting app recovery
2022-03-03T15:57:44.775+0800 INFO cache/application.go:466 handle
app recovering \{"app": "applicationID: app01, queue: root.a, partition:
default, totalNumOfTasks: 0, currentState: Recovering", "clusterID":
"my-kube-cluster"}
2022-03-03T15:57:44.775+0800 INFO cache/application.go:466 handle
app recovering \{"app": "applicationID: app02, queue: root.a, partition:
default, totalNumOfTasks: 0, currentState: Recovering", "clusterID":
"my-kube-cluster"}
2022-03-03T15:57:44.775+0800 INFO appmgmt/appmgmt_recovery.go:79 wait
for app recovery \{"appToRecover": 2}
2022-03-03T15:57:47.777+0800 INFO appmgmt/appmgmt.go:50 Initializing
new AppMgmt service
2022-03-03T15:57:47.777+0800 INFO appmgmt/appmgmt.go:83 registering app
management service \{"serviceName": "mocked-app-manager"}
2022-03-03T15:57:47.777+0800 INFO appmgmt/appmgmt_recovery.go:46
Starting app recovery
2022-03-03T15:57:47.777+0800 INFO cache/application.go:466 handle
app recovering \{"app": "applicationID: app01, queue: root.a, partition:
default, totalNumOfTasks: 0, currentState: Recovering", "clusterID":
"my-kube-cluster"}
2022-03-03T15:57:47.777+0800 INFO cache/application.go:466 handle
app recovering \{"app": "applicationID: app02, queue: root.a, partition:
default, totalNumOfTasks: 0, currentState: Recovering", "clusterID":
"my-kube-cluster"}
2022-03-03T15:57:47.777+0800 INFO appmgmt/appmgmt_recovery.go:79 wait
for app recovery \{"appToRecover": 2}
2022-03-03T15:57:47.777+0800 INFO appmgmt/appmgmt_recovery.go:93 app
recovery is successful
2022-03-03T15:57:47.779+0800 INFO dispatcher/dispatcher.go:80 Init
dispatcher \{"EventChannelCapacity": 1048576, "AsyncDispatchLimit": 104857,
"DispatchTimeoutInSeconds": 300}
2022-03-03T15:57:47.779+0800 INFO dispatcher/dispatcher.go:183
starting the dispatcher
2022-03-03T15:57:47.779+0800 INFO appmgmt/appmgmt.go:50 Initializing
new AppMgmt service
2022-03-03T15:57:47.779+0800 INFO appmgmt/appmgmt.go:83 registering app
management service \{"serviceName": "mocked-app-manager"}
2022-03-03T15:57:47.779+0800 INFO appmgmt/appmgmt_recovery.go:46
Starting app recovery
2022-03-03T15:57:47.779+0800 INFO dispatcher/dispatcher.go:215
stopping the dispatcher
2022-03-03T15:57:47.780+0800 INFO dispatcher/dispatcher.go:229
dispatcher is already stopped
--- FAIL: TestAppStatesDuringRecovery (0.00s)
panic: runtime error: invalid memory address or nil pointer dereference
[recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1ef0749]
goroutine 39 [running]:
testing.tRunner.func1.2(\{0x20c6560, 0x339b070})
/usr/local/go/src/testing/testing.go:1209 +0x36c
testing.tRunner.func1()
/usr/local/go/src/testing/testing.go:1212 +0x3b6
panic(\{0x20c6560, 0x339b070})
/usr/local/go/src/runtime/panic.go:1047 +0x266
github.com/apache/incubator-yunikorn-k8shim/pkg/cache.(*Application).GetApplicationID(0x0)
/home/lab/incubator-yunikorn-k8shim/pkg/cache/application.go:206 +0x49
github.com/apache/incubator-yunikorn-k8shim/pkg/appmgmt.(*AppManagementService).recoverApps(0xc0004f2080)
/home/lab/incubator-yunikorn-k8shim/pkg/appmgmt/appmgmt_recovery.go:63
+0x55d
github.com/apache/incubator-yunikorn-k8shim/pkg/appmgmt.TestAppStatesDuringRecovery(0x0)
/home/lab/incubator-yunikorn-k8shim/pkg/appmgmt/appmgmt_recovery_test.go:105
+0x1b2
testing.tRunner(0xc0002936c0, 0x2398338)
/usr/local/go/src/testing/testing.go:1259 +0x230
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:1306 +0x727
FAIL github.com/apache/incubator-yunikorn-k8shim/pkg/appmgmt 3.079s
ok github.com/apache/incubator-yunikorn-k8shim/pkg/appmgmt/general 0.108s
coverage: 75.0% of statements
? github.com/apache/incubator-yunikorn-k8shim/pkg/appmgmt/interfaces
[no test files]
? github.com/apache/incubator-yunikorn-k8shim/pkg/appmgmt/sparkoperator
[no test files]
2022-03-03T15:57:44.813+0800 INFO log/logger.go:89 scheduler
configuration, pretty print \{"configs": "{\n \"schedulerName\":
\"yunikorn\",\n \"clusterId\": \"my-kube-cluster\",\n \"clusterVersion\":
\"0.1\",\n \"policyGroup\": \"queues\",\n \"schedulingIntervalSecond\":
1000000000,\n \"absoluteKubeConfigFilePath\": \"\",\n \"loggingLevel\": 0,\n
\"logEncoding\": \"console\",\n \"logFilePath\": \"\",\n \"volumeBindTimeout\":
10000000000,\n \"testMode\": false,\n \"eventChannelCapacity\": 1048576,\n
\"dispatchTimeout\": 300000000000,\n \"kubeQPS\": 1000,\n \"kubeBurst\":
1000,\n \"predicates\": \"\",\n \"operatorPlugins\":
\"general,yunikorn-app\",\n \"enableConfigHotRefresh\": false,\n
\"disableGangScheduling\": false,\n \"userLabelKey\":
\"yunikorn.apache.org/username\"\n}"}
2022-03-03T15:57:44.813+0800 INFO cache/application.go:436 handle
app submission \{"app": "applicationID: app00001, queue: root.abc, partition:
default, totalNumOfTasks: 0, currentState: Submitted", "clusterID":
"my-kube-cluster"}
2022-03-03T15:57:44.813+0800 INFO cache/application.go:436 handle
app submission \{"app": "applicationID: app00001, queue: root.abc, partition:
default, totalNumOfTasks: 0, currentState: Submitted", "clusterID":
"my-kube-cluster"}
2022-03-03T15:57:44.815+0800 INFO dispatcher/dispatcher.go:80 Init
dispatcher \{"EventChannelCapacity": 1048576, "AsyncDispatchLimit": 104857,
"DispatchTimeoutInSeconds": 300}
2022-03-03T15:57:44.815+0800 INFO dispatcher/dispatcher.go:183
starting the dispatcher
2022-03-03T15:57:44.815+0800 INFO cache/placeholder_manager.go:144
starting the PlaceholderManager
2022-03-03T15:57:44.815+0800 INFO cache/application.go:621
failApplication reason \{"applicationID": "app-test-001", "errMsg": "Test
Error Message"}
2022-03-03T15:57:44.816+0800 INFO cache/application.go:621
failApplication reason \{"applicationID": "app-test-001", "errMsg": "Test
Error Message"}
2022-03-03T15:57:44.816+0800 INFO cache/application.go:621
failApplication reason \{"applicationID": "app-test-002", "errMsg": "Test
Error Message"}
2022-03-03T15:57:44.816+0800 INFO cache/application.go:621
failApplication reason \{"applicationID": "app-test-002", "errMsg": "Test
Error Message"}
2022-03-03T15:57:44.816+0800 INFO cache/placeholder_manager.go:166
stopping the PlaceholderManager
2022-03-03T15:57:44.816+0800 INFO dispatcher/dispatcher.go:215
stopping the dispatcher
2022-03-03T15:57:44.816+0800 INFO dispatcher/dispatcher.go:220 waiting
for dispatcher to be stopped \{"remainingSeconds": 5}
2022-03-03T15:57:44.816+0800 INFO dispatcher/dispatcher.go:204
shutting down event channel
2022-03-03T15:57:44.816+0800 INFO cache/placeholder_manager.go:104
start to clean up app placeholders \{"appID": "app-test-001"}
2022-03-03T15:57:44.816+0800 INFO cache/placeholder_manager.go:119
finished cleaning up app placeholders \{"appID": "app-test-001"}
2022-03-03T15:57:44.816+0800 INFO cache/placeholder_manager.go:104
start to clean up app placeholders \{"appID": "app-test-001"}
2022-03-03T15:57:44.816+0800 INFO cache/placeholder_manager.go:119
finished cleaning up app placeholders \{"appID": "app-test-001"}
2022-03-03T15:57:44.816+0800 INFO cache/placeholder_manager.go:104
start to clean up app placeholders \{"appID": "app-test-002"}
2022-03-03T15:57:44.816+0800 INFO cache/placeholder_manager.go:119
finished cleaning up app placeholders \{"appID": "app-test-002"}
2022-03-03T15:57:44.817+0800 INFO cache/placeholder_manager.go:104
start to clean up app placeholders \{"appID": "app-test-002"}
2022-03-03T15:57:44.817+0800 INFO cache/placeholder_manager.go:119
finished cleaning up app placeholders \{"appID": "app-test-002"}
2022-03-03T15:57:44.817+0800 INFO cache/placeholder_manager.go:152
PlaceholderManager has been stopped
2022-03-03T15:57:45.817+0800 INFO dispatcher/dispatcher.go:226
dispatcher stopped
2022-03-03T15:57:45.817+0800 INFO dispatcher/dispatcher.go:183
starting the dispatcher
2022-03-03T15:57:45.818+0800 INFO cache/placeholder_manager.go:144
starting the PlaceholderManager
2022-03-03T15:57:45.818+0800 INFO cache/application.go:621
failApplication reason \{"applicationID": "app-test-001", "errMsg":
"ResourceReservationTimeout"}
2022-03-03T15:57:45.818+0800 INFO cache/application.go:602 setting
pod to failed \{"podName": "pod-test-00001"}
2022-03-03T15:57:45.818+0800 INFO client/kubeclient_mock.go:65 pod
status updated \{"PodName": "pod-test-00001"}
2022-03-03T15:57:45.818+0800 INFO cache/application.go:607 new pod
status \{"status": "Failed"}
2022-03-03T15:57:45.818+0800 INFO cache/application.go:602 setting
pod to failed \{"podName": "pod-test-00002"}
2022-03-03T15:57:45.818+0800 INFO client/kubeclient_mock.go:65 pod
status updated \{"PodName": "pod-test-00002"}
2022-03-03T15:57:45.818+0800 INFO cache/application.go:607 new pod
status \{"status": "Failed"}
2022-03-03T15:57:45.818+0800 INFO cache/application.go:602 setting
pod to failed \{"podName": "pod-test-00003"}
2022-03-03T15:57:45.818+0800 INFO client/kubeclient_mock.go:65 pod
status updated \{"PodName": "pod-test-00003"}
2022-03-03T15:57:45.818+0800 INFO cache/application.go:607 new pod
status \{"status": "Failed"}
2022-03-03T15:57:45.818+0800 INFO cache/application.go:621
failApplication reason \{"applicationID": "app-test-001", "errMsg":
"ResourceReservationTimeout"}
2022-03-03T15:57:45.818+0800 INFO cache/application.go:602 setting
pod to failed \{"podName": "pod-test-00001"}
2022-03-03T15:57:45.818+0800 INFO client/kubeclient_mock.go:65 pod
status updated \{"PodName": "pod-test-00001"}
2022-03-03T15:57:45.818+0800 INFO cache/application.go:607 new pod
status \{"status": "Failed"}
2022-03-03T15:57:45.819+0800 INFO cache/application.go:602 setting
pod to failed \{"podName": "pod-test-00002"}
2022-03-03T15:57:45.819+0800 INFO client/kubeclient_mock.go:65 pod
status updated \{"PodName": "pod-test-00002"}
2022-03-03T15:57:45.819+0800 INFO cache/application.go:607 new pod
status \{"status": "Failed"}
2022-03-03T15:57:45.819+0800 INFO cache/application.go:602 setting
pod to failed \{"podName": "pod-test-00003"}
2022-03-03T15:57:45.819+0800 INFO client/kubeclient_mock.go:65 pod
status updated \{"PodName": "pod-test-00003"}
2022-03-03T15:57:45.819+0800 INFO cache/application.go:607 new pod
status \{"status": "Failed"}
2022-03-03T15:57:45.819+0800 INFO cache/placeholder_manager.go:166
stopping the PlaceholderManager
2022-03-03T15:57:45.819+0800 INFO cache/placeholder_manager.go:104
start to clean up app placeholders \{"appID": "app-test-001"}
2022-03-03T15:57:45.819+0800 INFO client/kubeclient_mock.go:55 pod
deleted \{"PodName": "pod-test-00002"}
2022-03-03T15:57:45.819+0800 INFO cache/placeholder_manager.go:119
finished cleaning up app placeholders \{"appID": "app-test-001"}
2022-03-03T15:57:45.819+0800 INFO cache/placeholder_manager.go:152
PlaceholderManager has been stopped
2022-03-03T15:57:45.819+0800 INFO cache/placeholder_manager.go:104
start to clean up app placeholders \{"appID": "app-test-001"}
2022-03-03T15:57:45.819+0800 INFO client/kubeclient_mock.go:55 pod
deleted \{"PodName": "pod-test-00002"}
2022-03-03T15:57:45.819+0800 INFO cache/placeholder_manager.go:119
finished cleaning up app placeholders \{"appID": "app-test-001"}
2022-03-03T15:57:45.819+0800 INFO dispatcher/dispatcher.go:215
stopping the dispatcher
2022-03-03T15:57:45.819+0800 INFO dispatcher/dispatcher.go:220 waiting
for dispatcher to be stopped \{"remainingSeconds": 5}
2022-03-03T15:57:45.819+0800 INFO dispatcher/dispatcher.go:204
shutting down event channel
2022-03-03T15:57:46.820+0800 INFO dispatcher/dispatcher.go:226
dispatcher stopped
2022-03-03T15:57:46.820+0800 INFO dispatcher/dispatcher.go:183
starting the dispatcher
2022-03-03T15:57:46.820+0800 INFO cache/placeholder_manager.go:144
starting the PlaceholderManager
2022-03-03T15:57:46.821+0800 INFO cache/application.go:576 app is
rejected by scheduler \{"appID": "app-test-001"}
2022-03-03T15:57:46.821+0800 INFO cache/application.go:621
failApplication reason \{"applicationID": "app-test-001", "errMsg":
"ApplicationRejected: app rejected"}
2022-03-03T15:57:46.821+0800 INFO cache/application.go:602 setting
pod to failed \{"podName": "pod-test-00002"}
2022-03-03T15:57:46.821+0800 INFO client/kubeclient_mock.go:65 pod
status updated \{"PodName": "pod-test-00002"}
2022-03-03T15:57:46.821+0800 INFO cache/application.go:607 new pod
status \{"status": "Failed"}
2022-03-03T15:57:46.821+0800 INFO cache/application.go:602 setting
pod to failed \{"podName": "pod-test-00001"}
2022-03-03T15:57:46.821+0800 INFO client/kubeclient_mock.go:65 pod
status updated \{"PodName": "pod-test-00001"}
2022-03-03T15:57:46.821+0800 INFO cache/application.go:607 new pod
status \{"status": "Failed"}
2022-03-03T15:57:46.821+0800 INFO cache/placeholder_manager.go:166
stopping the PlaceholderManager
2022-03-03T15:57:46.821+0800 INFO cache/placeholder_manager.go:152
PlaceholderManager has been stopped
2022-03-03T15:57:46.821+0800 INFO dispatcher/dispatcher.go:215
stopping the dispatcher
2022-03-03T15:57:46.821+0800 INFO dispatcher/dispatcher.go:229
dispatcher is already stopped
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1f256ad]
goroutine 168 [running]:
github.com/apache/incubator-yunikorn-k8shim/pkg/cache.(*Application).canHandle(0x0,
\{0x2608e30, 0xc000225710})
/home/lab/incubator-yunikorn-k8shim/pkg/cache/application.go:190 +0x6d
github.com/apache/incubator-yunikorn-k8shim/pkg/cache.(*Context).ApplicationEventHandler.func1(\{0x2221640,
0xc000225710})
/home/lab/incubator-yunikorn-k8shim/pkg/cache/context.go:855 +0x13a
github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.Start.func1()
/home/lab/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher.go:192
+0x28c
created by github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher.Start
/home/lab/incubator-yunikorn-k8shim/pkg/dispatcher/dispatcher.go:184
+0x73
FAIL github.com/apache/incubator-yunikorn-k8shim/pkg/cache 2.094s
ok github.com/apache/incubator-yunikorn-k8shim/pkg/cache/external 0.145s
coverage: 47.3% of statements
? github.com/apache/incubator-yunikorn-k8shim/pkg/callback [no
test files]
? github.com/apache/incubator-yunikorn-k8shim/pkg/client [no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/client/clientset/versioned
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/client/clientset/versioned/fake
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/client/clientset/versioned/scheme
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/client/clientset/versioned/typed/yunikorn.apache.org/v1alpha1
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/client/clientset/versioned/typed/yunikorn.apache.org/v1alpha1/fake
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/client/informers/externalversions
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/client/informers/externalversions/internalinterfaces
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/client/informers/externalversions/yunikorn.apache.org
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/client/informers/externalversions/yunikorn.apache.org/v1alpha1
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/client/listers/yunikorn.apache.org/v1alpha1
[no test files]
? github.com/apache/incubator-yunikorn-k8shim/pkg/cmd/schedulerplugin
[no test files]
? github.com/apache/incubator-yunikorn-k8shim/pkg/cmd/shim [no
test files]
ok github.com/apache/incubator-yunikorn-k8shim/pkg/common 0.114s
coverage: 83.8% of statements
? github.com/apache/incubator-yunikorn-k8shim/pkg/common/constants
[no test files]
ok github.com/apache/incubator-yunikorn-k8shim/pkg/common/events 0.115s
coverage: 20.0% of statements
? github.com/apache/incubator-yunikorn-k8shim/pkg/common/test [no
test files]
ok github.com/apache/incubator-yunikorn-k8shim/pkg/common/utils 0.052s
coverage: 75.0% of statements
ok github.com/apache/incubator-yunikorn-k8shim/pkg/conf 0.030s
coverage: 48.9% of statements
ok github.com/apache/incubator-yunikorn-k8shim/pkg/controller/application
0.085s coverage: 81.3% of statements
ok github.com/apache/incubator-yunikorn-k8shim/pkg/dispatcher 9.747s
coverage: 90.4% of statements
? github.com/apache/incubator-yunikorn-k8shim/pkg/log [no test files]
ok github.com/apache/incubator-yunikorn-k8shim/pkg/pki 30.984s
coverage: 69.6% of statements
ok
github.com/apache/incubator-yunikorn-k8shim/pkg/plugin/admissioncontrollers/webhook
22.457s coverage: 69.5% of statements
ok github.com/apache/incubator-yunikorn-k8shim/pkg/plugin/predicates
0.110s coverage: 76.0% of statements
ok github.com/apache/incubator-yunikorn-k8shim/pkg/plugin/support 0.072s
coverage: 78.4% of statements
? github.com/apache/incubator-yunikorn-k8shim/pkg/schedulerplugin [no
test files]
ok github.com/apache/incubator-yunikorn-k8shim/pkg/schedulerplugin/conf
0.036s coverage: 87.0% of statements
2022-03-03T15:57:51.390+0800 INFO entrypoint/entrypoint.go:43
ServiceContext start all services
2022-03-03T15:57:51.392+0800 INFO entrypoint/entrypoint.go:89
ServiceContext start scheduling services
2022-03-03T15:57:51.393+0800 INFO entrypoint/entrypoint.go:106
creating InternalMetricsHistory
2022-03-03T15:57:51.393+0800 INFO entrypoint/entrypoint.go:113
ServiceContext start web application service
2022-03-03T15:57:51.396+0800 INFO webservice/webservice.go:71 web-app
started \{"port": 9080}
2022-03-03T15:57:51.396+0800 INFO log/logger.go:89 scheduler
configuration, pretty print \{"configs": "{\n \"schedulerName\":
\"yunikorn\",\n \"clusterId\": \"my-kube-cluster\",\n \"clusterVersion\":
\"0.1\",\n \"policyGroup\": \"queues\",\n \"schedulingIntervalSecond\":
1000000000,\n \"absoluteKubeConfigFilePath\": \"\",\n \"loggingLevel\": 0,\n
\"logEncoding\": \"console\",\n \"logFilePath\": \"\",\n \"volumeBindTimeout\":
10000000000,\n \"testMode\": true,\n \"eventChannelCapacity\": 1048576,\n
\"dispatchTimeout\": 300000000000,\n \"kubeQPS\": 1000,\n \"kubeBurst\":
1000,\n \"predicates\": \"\",\n \"operatorPlugins\":
\"general,yunikorn-app\",\n \"enableConfigHotRefresh\": false,\n
\"disableGangScheduling\": false,\n \"userLabelKey\":
\"yunikorn.apache.org/username\"\n}"}
2022-03-03T15:57:51.396+0800 INFO appmgmt/appmgmt.go:50 Initializing
new AppMgmt service
2022-03-03T15:57:51.397+0800 INFO dispatcher/dispatcher.go:80 Init
dispatcher \{"EventChannelCapacity": 1048576, "AsyncDispatchLimit": 104857,
"DispatchTimeoutInSeconds": 300}
2022-03-03T15:57:51.397+0800 WARN appmgmt/appmgmt.go:133 App manager is
not registered \{"app manager name": "yunikorn-app"}
2022-03-03T15:57:51.397+0800 INFO dispatcher/dispatcher.go:183
starting the dispatcher
2022-03-03T15:57:51.397+0800 INFO cache/placeholder_manager.go:144
starting the PlaceholderManager
2022-03-03T15:57:51.397+0800 INFO shim/scheduler_mock_test.go:119 waiting
for scheduler state \{"expected": "Running", "actual": "New"}
2022-03-03T15:57:51.398+0800 INFO shim/scheduler.go:258 register RM to
the scheduler \{"clusterID": "my-kube-cluster", "clusterVersion": "0.1",
"policyGroup": "queues", "buildInfo":
{"buildDate":"","buildVersion":"","isPluginVersion":"false"}}
2022-03-03T15:57:51.398+0800 DEBUG scheduler/scheduler.go:97
enqueued event \{"eventType": "*rmevent.RMRegistrationEvent", "eventError":
"json: unsupported type: chan *rmevent.Result", "currentQueueSize": 0}
2022-03-03T15:57:51.398+0800 DEBUG configs/configvalidator.go:331
checking partition queue config \{"partitionName": "default"}
2022-03-03T15:57:51.398+0800 INFO scheduler/context.go:343 added
partitions \{"partitionName": "[my-kube-cluster]default"}
2022-03-03T15:57:51.398+0800 INFO security/acl.go:63 user list is
wildcard, allowing all access
2022-03-03T15:57:51.398+0800 INFO objects/queue.go:117 configured
queue added to scheduler \{"queueName": "root"}
2022-03-03T15:57:51.398+0800 INFO objects/queue.go:117 configured
queue added to scheduler \{"queueName": "root.a"}
2022-03-03T15:57:51.398+0800 INFO scheduler/partition.go:121 root
queue added \{"partitionName": "[my-kube-cluster]default", "rmID":
"my-kube-cluster"}
2022-03-03T15:57:51.399+0800 INFO security/usergroup.go:79
creating UserGroupCache without resolver
2022-03-03T15:57:51.399+0800 INFO security/usergroup.go:83
starting UserGroupCache cleaner \{"cleanerInterval": "1m0s"}
2022-03-03T15:57:51.399+0800 INFO scheduler/partition.go:149
NodeSorting policy set from config \{"policyName": "fair"}
2022-03-03T15:57:51.399+0800 DEBUG objects/nodesorting.go:140 new
node sorting policy added \{"type": "fair", "resourceWeights":
{"memory":1,"vcore":1}}
2022-03-03T15:57:51.399+0800 INFO plugins/plugins.go:36 register
scheduler plugin: ResourceManagerCallback
2022-03-03T15:57:51.399+0800 INFO scheduler/partition_manager.go:62
starting partition manager \{"partition": "[my-kube-cluster]default",
"cleanRootInterval": "10s"}
2022-03-03T15:57:51.399+0800 INFO shim/scheduler.go:192 recovering
scheduler states
2022-03-03T15:57:51.399+0800 INFO shim/scheduler.go:226 scheduler
recovery succeed
2022-03-03T15:57:51.400+0800 INFO shim/scheduler.go:356 No outstanding
apps found for a while \{"timeout": "2m0s"}
2022-03-03T15:57:52.393+0800 DEBUG scheduler/scheduler.go:137 inspect
outstanding requests
report new nodes to scheduler, request: nodes:<nodeID:"test.host.01"
action:CREATE attributes:<key:"si.io/hostname" value:"test.host.01" >
attributes:<key:"si.io/rackname" value:"/rack-default" >
schedulableResource:<resources:<key:"memory" value:<value:100 > >
resources:<key:"vcore" value:<value:10 > > > > rmID:"my-kube-cluster" report
new nodes to scheduler, request: nodes:<nodeID:"test.host.02" action:CREATE
attributes:<key:"si.io/hostname" value:"test.host.02" >
attributes:<key:"si.io/rackname" value:"/rack-default" >
schedulableResource:<resources:<key:"memory" value:<value:100 > >
resources:<key:"vcore" value:<value:10 > > > > rmID:"my-kube-cluster"
2022-03-03T15:57:52.399+0800 INFO scheduler/partition.go:566 adding
node to partition \{"partition": "[my-kube-cluster]default", "nodeID":
"test.host.01"}
2022-03-03T15:57:52.400+0800 INFO shim/scheduler.go:340 stopping
scheduler
2022-03-03T15:57:52.400+0800 INFO dispatcher/dispatcher.go:215
stopping the dispatcher
2022-03-03T15:57:52.400+0800 INFO objects/queue.go:1050 updating root
queue max resources \{"current max": "nil resource", "new max":
"map[memory:100 vcore:10]"}
2022-03-03T15:57:52.400+0800 INFO dispatcher/dispatcher.go:220 waiting
for dispatcher to be stopped \{"remainingSeconds": 5}
2022-03-03T15:57:52.400+0800 INFO scheduler/partition.go:631 Updated
available resources from added node \{"partitionName":
"[my-kube-cluster]default", "nodeID": "test.host.01", "partitionResource":
"map[memory:100 vcore:10]"}
2022-03-03T15:57:52.400+0800 INFO dispatcher/dispatcher.go:204
shutting down event channel
2022-03-03T15:57:52.400+0800 INFO scheduler/context.go:595
successfully added node \{"nodeID": "test.host.01", "partition":
"[my-kube-cluster]default"}
2022-03-03T15:57:52.400+0800 DEBUG rmproxy/rmproxy.go:64 enqueue event
\{"event":
{"RmID":"my-kube-cluster","AcceptedNodes":[{"nodeID":"test.host.01"}],"RejectedNodes":[]},
"currentQueueSize": 0}
2022-03-03T15:57:52.400+0800 WARN dispatcher/dispatcher.go:116 failed
to dispatch SchedulingEvent \{"error": "dispatcher is not running"}
2022-03-03T15:57:52.400+0800 INFO scheduler/partition.go:566 adding
node to partition \{"partition": "[my-kube-cluster]default", "nodeID":
"test.host.02"}
2022-03-03T15:57:52.399+0800 DEBUG scheduler/scheduler.go:97
enqueued event \{"eventType": "*rmevent.RMUpdateNodeEvent", "event":
{"Request":{"nodes":[{"nodeID":"test.host.01","action":1,"attributes":{"si.io/hostname":"test.host.01","si.io/rackname":"/rack-default","si/node-partition":"[my-kube-cluster]default"},"schedulableResource":\{"resources":{"memory":{"value":100},"vcore":\{"value":10}}}}],"rmID":"my-kube-cluster"}},
"currentQueueSize": 0}
2022-03-03T15:57:52.400+0800 DEBUG scheduler/scheduler.go:97
enqueued event \{"eventType": "*rmevent.RMUpdateNodeEvent", "event":
{"Request":{"nodes":[{"nodeID":"test.host.02","action":1,"attributes":{"si.io/hostname":"test.host.02","si.io/rackname":"/rack-default","si/node-partition":"[my-kube-cluster]default"},"schedulableResource":\{"resources":{"memory":{"value":100},"vcore":\{"value":10}}}}],"rmID":"my-kube-cluster"}},
"currentQueueSize": 1}
2022-03-03T15:57:52.400+0800 INFO objects/queue.go:1050 updating root
queue max resources \{"current max": "map[memory:100 vcore:10]", "new
max": "map[memory:200 vcore:20]"}
2022-03-03T15:57:52.400+0800 INFO scheduler/partition.go:631 Updated
available resources from added node \{"partitionName":
"[my-kube-cluster]default", "nodeID": "test.host.02", "partitionResource":
"map[memory:200 vcore:20]"}
2022-03-03T15:57:52.400+0800 INFO scheduler/context.go:595
successfully added node \{"nodeID": "test.host.02", "partition":
"[my-kube-cluster]default"}
2022-03-03T15:57:52.400+0800 DEBUG rmproxy/rmproxy.go:64 enqueue event
\{"event":
{"RmID":"my-kube-cluster","AcceptedNodes":[{"nodeID":"test.host.02"}],"RejectedNodes":[]},
"currentQueueSize": 0}
2022-03-03T15:57:52.400+0800 WARN dispatcher/dispatcher.go:116 failed
to dispatch SchedulingEvent \{"error": "dispatcher is not running"}
2022-03-03T15:57:53.393+0800 DEBUG scheduler/scheduler.go:137 inspect
outstanding requests
2022-03-03T15:57:53.400+0800 INFO dispatcher/dispatcher.go:226
dispatcher stopped
2022-03-03T15:57:53.400+0800 INFO appmgmt/appmgmt.go:120 shutting down
app management services
2022-03-03T15:57:53.400+0800 INFO cache/placeholder_manager.go:166
stopping the PlaceholderManager
--- FAIL: TestApplicationScheduling (2.01s)
2022-03-03T15:57:53.400+0800 INFO cache/placeholder_manager.go:152
PlaceholderManager has been stopped
panic: runtime error: invalid memory address or nil pointer dereference
[recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x1f63205]
goroutine 66 [running]:
testing.tRunner.func1.2(\{0x2256e80, 0x362ac10})
/usr/local/go/src/testing/testing.go:1209 +0x36c
testing.tRunner.func1()
/usr/local/go/src/testing/testing.go:1212 +0x3b6
panic(\{0x2256e80, 0x362ac10})
/usr/local/go/src/runtime/panic.go:1047 +0x266
github.com/apache/incubator-yunikorn-k8shim/pkg/cache.(*Application).GetTask(0x0,
\{0x24865fc, 0x8})
/home/lab/incubator-yunikorn-k8shim/pkg/cache/application.go:196 +0x65
github.com/apache/incubator-yunikorn-k8shim/pkg/cache.(*Context).AddTask(0x223ef00,
0xc0005c5d78)
/home/lab/incubator-yunikorn-k8shim/pkg/cache/context.go:678 +0x31b
github.com/apache/incubator-yunikorn-k8shim/pkg/shim.(*MockScheduler).addTask(...)
/home/lab/incubator-yunikorn-k8shim/pkg/shim/scheduler_mock_test.go:100
github.com/apache/incubator-yunikorn-k8shim/pkg/shim.TestApplicationScheduling(0x0)
/home/lab/incubator-yunikorn-k8shim/pkg/shim/scheduler_test.go:80 +0x79c
testing.tRunner(0xc000582ea0, 0x2551fe0)
/usr/local/go/src/testing/testing.go:1259 +0x230
created by testing.(*T).Run
/usr/local/go/src/testing/testing.go:1306 +0x727
FAIL github.com/apache/incubator-yunikorn-k8shim/pkg/shim 2.077s
?
github.com/apache/incubator-yunikorn-k8shim/pkg/simulation/gang/gangclient
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/simulation/gang/webserver
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/sparkclient/clientset/versioned
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/sparkclient/clientset/versioned/fake
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/sparkclient/clientset/versioned/scheme
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/sparkclient/clientset/versioned/typed/sparkoperator.k8s.io/v1beta2
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/sparkclient/clientset/versioned/typed/sparkoperator.k8s.io/v1beta2/fake
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/sparkclient/informers/externalversions
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/sparkclient/informers/externalversions/internalinterfaces
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/sparkclient/informers/externalversions/sparkoperator.k8s.io
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/sparkclient/informers/externalversions/sparkoperator.k8s.io/v1beta2
[no test files]
?
github.com/apache/incubator-yunikorn-k8shim/pkg/sparkclient/listers/sparkoperator.k8s.io/v1beta2
[no test files]
FAIL
make: *** [Makefile:296: test] Error 1
{code}
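So I think there is a problem with the simplification: all three panics are method calls on a nil _*Application_ receiver (_GetApplicationID_, _canHandle_ and _GetTask_ are each called with 0x0).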
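A likely explanation, assuming _ctx.applications_ is a map of _*Application_ pointers while _GetApplication_ returns the _interfaces.ManagedApp_ interface: a missing key yields a nil _*Application_, and returning that through the interface gives a non-nil interface value that wraps a nil pointer. Callers that check the result against nil then proceed and call a method on the nil receiver. The sketch below reproduces this in isolation. The types and function names are hypothetical stand-ins, not the shim code, and the lock is left out for brevity.

```go
package main

import "fmt"

// ManagedApp is a hypothetical stand-in for interfaces.ManagedApp.
type ManagedApp interface {
	GetApplicationID() string
}

// Application is a hypothetical stand-in for cache.Application.
type Application struct {
	applicationID string
}

// GetApplicationID dereferences the receiver, so it panics when a is nil.
func (a *Application) GetApplicationID() string {
	return a.applicationID
}

// Assumption: the map stores concrete pointers, not interface values.
var applications = map[string]*Application{
	"app01": {applicationID: "app01"},
}

// getSimplified mirrors the simplified version: a missing key gives a nil
// *Application, which is then wrapped in a non-nil ManagedApp interface.
func getSimplified(appID string) ManagedApp {
	return applications[appID]
}

// getWithCheck mirrors the original version: a missing key returns an
// untyped nil, so the interface value itself is nil.
func getWithCheck(appID string) ManagedApp {
	if app, ok := applications[appID]; ok {
		return app
	}
	return nil
}

func main() {
	fmt.Println(getSimplified("missing") == nil) // false
	fmt.Println(getWithCheck("missing") == nil)  // true
}
```

If this is what happens in the shim, the comma-ok form is not redundant and should stay, unless the map is changed to store interface values or the callers are changed to check for a nil pointer.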
> shim context getTask error is ignored
> -------------------------------------
>
> Key: YUNIKORN-1102
> URL: https://issues.apache.org/jira/browse/YUNIKORN-1102
> Project: Apache YuniKorn
> Issue Type: Improvement
> Components: shim - kubernetes
> Reporter: Wilfred Spiegelenburg
> Assignee: Tseng Hsi-Huang
> Priority: Major
> Labels: newbie
>
> The {{context.getTask()}} call returns a {{Task}} struct and an error. In all
> cases either the task is nil or the error is nil. The error returned is also
> fixed and does not provide any detail not known without it (i.e. "application
> is not found" which only partially covers the cases)
> We need to bring the call in line with _GetApplication_ and the _GetNode_ in
> the cache and just return the task and handle the nil return.