Alexander Rukletsov created MESOS-9395:
------------------------------------------

             Summary: Check failure on
                 Key: MESOS-9395
                 URL: https://issues.apache.org/jira/browse/MESOS-9395
             Project: Mesos
          Issue Type: Bug
          Components: resource provider
    Affects Versions: 1.7.0
            Reporter: Alexander Rukletsov

Observed the following agent failure on one of our staging clusters:
{noformat}
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: I1116 11:57:24.641331 26684 http.cpp:1799] Processing GET_AGENT call
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: I1116 11:57:24.650429 26679 http.cpp:1117] HTTP POST for /slave(1)/api/v1/resource_provider from 172.31.8.65:57790
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: I1116 11:57:24.650629 26679 manager.cpp:672] Subscribing resource provider {"attributes":[{"name":"lvm-vg-name","text":{"value":"lvm-double-1540383639"},"type":"SCALAR"},{"name":"dss-asset-id","text":{"value":"6AbZV6W2DrK4YgcIR3ICVo"},"type":"SCALAR"}],"default_reservations":[{"principal":"storage-principal","role":"dcos-storage","type":"DYNAMIC"}],"id":{"value":"8326e931-41f2-4f45-9174-13fe35c19300"},"name":"rp_6AbZV6W2DrK4YgcIR3ICVo","storage":{"plugin":{"containers":[{"command":{"environment":{"variables":[{"name":"PATH","type":"VALUE","value":"/opt/mesosphere/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"},{"name":"LD_LIBRARY_PATH","type":"VALUE","value":"/opt/mesosphere/lib"},{"name":"CONTAINER_LOGGER_DESTINATION_TYPE","type":"VALUE","value":"journald+logrotate"},{"name":"CONTAINER_LOGGER_EXTRA_LABELS","type":"VALUE","value":"{\"CSI_PLUGIN\":\"csilvm\"}"}]},"shell":true,"uris":[{"executable":true,"extract":false,"value":"<possibly-sensitive>"}],"value":"echo \"a *:* rwm\" > /sys/fs/cgroup/devices`cat /proc/self/cgroup | grep devices | cut -d : -f 3`/devices.allow; exec ./csilvm -devices=/dev/xvdk,/dev/xvdj -volume-group=lvm-double-1540383639 -unix-addr-env=CSI_ENDPOINT -tag=6AbZV6W2DrK4YgcIR3ICVo"},"resources":[{"name":"cpus","scalar":{"value":0.1},"type":"SCALAR"},{"name":"mem","scalar":{"value":128.0},"type":"SCALAR"},{"name":"disk","scalar":{"value":10.0},"type":"SCALAR"}],"services":["CONTROLLER_SERVICE","NODE_SERVICE"]}],"name":"plugin_6AbZV6W2DrK4YgcIR3ICVo","type":"io.mesosphere.dcos.storage.csilvm"}},"type":"org.apache.mesos.rp.local.storage"}
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: I1116 11:57:24.690474 26685 provider.cpp:546] Received SUBSCRIBED event
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: I1116 11:57:24.690521 26685 provider.cpp:1492] Subscribed with ID 8326e931-41f2-4f45-9174-13fe35c19300
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: I1116 11:57:24.690657 26681 status_update_manager_process.hpp:314] Recovering operation status update manager
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: F1116 11:57:24.691496 26682 provider.cpp:3121] Check failed: resource.disk().source().has_profile() != resource.disk().source().has_id() (1 vs. 1)
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: *** Check failure stack trace: ***
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecb099e9fd google::LogMessage::Fail()
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecb09a082d google::LogMessage::SendToLog()
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecb099e5ec google::LogMessage::Flush()
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecb09a1129 google::LogMessageFatal::~LogMessageFatal()
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecb01654ca mesos::internal::StorageLocalResourceProviderProcess::applyCreateDisk()
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecb017c683 mesos::internal::StorageLocalResourceProviderProcess::_applyOperation()
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecb017d64a _ZZN5mesos8internal35StorageLocalResourceProviderProcess26reconcileOperationStatusesEvENKUlRKNS0_26StatusUpdateManagerProcessIN2id4UUIDENS0_27UpdateOperationStatusRecordENS0_28UpdateOperationStatusMessageEE5StateEE_clESA_
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecb017dd21 _ZNO6lambda12CallableOnceIFN7process6FutureI7NothingEEvEE10CallableFnINS_8internal7PartialIZN5mesos8internal35StorageLocalResourceProviderProcess26reconcileOperationStatusesEvEUlRKNSB_26StatusUpdateManagerProcessIN2id4UUIDENSB_27UpdateOperationStatusRecordENSB_28UpdateOperationStatusMessageEE5StateEE_ISJ_EEEEclEv
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecafa0ce97 _ZNO6lambda12CallableOnceIFvPN7process11ProcessBaseEEE10CallableFnINS_8internal7PartialIZNS1_8internal8DispatchINS1_6FutureI7NothingEEEclINS0_IFSD_vEEEEESD_RKNS1_4UPIDEOT_EUlSt10unique_ptrINS1_7PromiseISC_EESt14default_deleteISP_EEOSH_S3_E_JSS_SH_St12_PlaceholderILi1EEEEEEclEOS3_
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecb08eec51 process::ProcessBase::consume()
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecb09056cc process::ProcessManager::resume()
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecb090b186 _ZNSt6thread5_ImplISt12_Bind_simpleIFZN7process14ProcessManager12init_threadsEvEUlvE_vEEE6_M_runEv
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecad5d8070 (unknown)
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecacdf6e25 start_thread
Nov 16 11:57:24 int-mountvolumeagent2-soak112s.testing.mesosphe.re mesos-agent[26663]: @ 0x7fecacb20bad __clone
{noformat}
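
For readers unfamiliar with the fatal line: the trailing "(1 vs. 1)" reads as the two operand values of the failed comparison, meaning both has_profile() and has_id() returned true for the disk source of the operation being applied during reconciliation. The sketch below only illustrates that invariant. The Source struct and the exactlyOneOfProfileOrId helper are hypothetical stand-ins written for this note, not the Mesos protobuf types or the code in provider.cpp.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical stand-in for the disk source message; only the two fields
// named in the failed check are modeled here.
struct Source {
  std::optional<std::string> profile;
  std::optional<std::string> id;

  bool has_profile() const { return profile.has_value(); }
  bool has_id() const { return id.has_value(); }
};

// Mirrors the condition from the log line: true only when exactly one of
// `profile` and `id` is set. Both set (the "1 vs. 1" case) or neither set
// yields false.
bool exactlyOneOfProfileOrId(const Source& source) {
  return source.has_profile() != source.has_id();
}
```

Under this reading, a source carrying both a profile and an ID fails the condition, which matches the values printed in the log. Why the recovered operation carried both is not established by the log alone.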