[jira] [Commented] (MESOS-9083) Test ReservationEndpointsTest.ReserveAndUnreserveNoAuthentication is flaky.
[ https://issues.apache.org/jira/browse/MESOS-9083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710575#comment-16710575 ] Vinod Kone commented on MESOS-9083: --- Still happening on ASF CI. {code} [ RUN ] ReservationEndpointsTest.ReserveAndUnreserveNoAuthentication I1205 16:30:33.806411 22505 cluster.cpp:173] Creating default 'local' authorizer I1205 16:30:33.809387 22511 master.cpp:413] Master 80f814ea-0afc-4cec-8891-dfe913ca3075 (9b6ccb5930cd) started on 172.17.0.3:36088 I1205 16:30:33.809422 22511 master.cpp:416] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1000secs" --allocator="hierarchical" --authenticate_agents="true" --authenticate_frameworks="false" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="false" --authentication_v0_timeout="15secs" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/7ITn89/credentials" --filter_gpu_resources="true" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per_framework="1000" --memory_profiling="false" --min_allocatable_resources="cpus:0.01|mem:32" --port="5050" --publish_per_framework_metrics="true" --quiet="false" --recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --registry_strict="false" --require_agent_domain="false" --role_sorter="drf" --roles="role" --root_submissions="true" --version="false" 
--webui_dir="/tmp/SRC/build/mesos-1.8.0/_inst/share/mesos/webui" --work_dir="/tmp/7ITn89/master" --zk_session_timeout="10secs" I1205 16:30:33.809890 22511 master.cpp:467] Master allowing unauthenticated frameworks to register I1205 16:30:33.809912 22511 master.cpp:471] Master only allowing authenticated agents to register I1205 16:30:33.809926 22511 master.cpp:477] Master only allowing authenticated HTTP frameworks to register I1205 16:30:33.809937 22511 credentials.hpp:37] Loading credentials for authentication from '/tmp/7ITn89/credentials' I1205 16:30:33.810329 22511 master.cpp:521] Using default 'crammd5' authenticator I1205 16:30:33.810554 22511 http.cpp:1042] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readonly' I1205 16:30:33.810809 22511 http.cpp:1042] Creating default 'basic' HTTP authenticator for realm 'mesos-master-scheduler' I1205 16:30:33.810992 22511 master.cpp:602] Authorization enabled W1205 16:30:33.811025 22511 master.cpp:665] The '--roles' flag is deprecated. This flag will be removed in the future. See the Mesos 0.27 upgrade notes for more information I1205 16:30:33.811547 22510 whitelist_watcher.cpp:77] No whitelist given I1205 16:30:33.811564 22508 hierarchical.cpp:175] Initialized hierarchical allocator process I1205 16:30:33.814721 22509 master.cpp:2105] Elected as the leading master! 
I1205 16:30:33.814755 22509 master.cpp:1660] Recovering from registrar I1205 16:30:33.814954 22514 registrar.cpp:339] Recovering registrar I1205 16:30:33.815670 22514 registrar.cpp:383] Successfully fetched the registry (0B) in 669952ns I1205 16:30:33.815798 22514 registrar.cpp:487] Applied 1 operations in 39331ns; attempting to update the registry I1205 16:30:33.816577 22508 registrar.cpp:544] Successfully updated the registry in 710912ns I1205 16:30:33.816747 22508 registrar.cpp:416] Successfully recovered registrar I1205 16:30:33.817325 22521 master.cpp:1774] Recovered 0 agents from the registry (135B); allowing 10mins for agents to reregister I1205 16:30:33.817361 22517 hierarchical.cpp:215] Skipping recovery of hierarchical allocator: nothing to recover W1205 16:30:33.823312 22505 process.cpp:2829] Attempted to spawn already running process files@172.17.0.3:36088 I1205 16:30:33.824642 22505 containerizer.cpp:305] Using isolation { environment_secret, posix/cpu, posix/mem, filesystem/posix, network/cni } W1205 16:30:33.825306 22505 backend.cpp:76] Failed to create 'aufs' backend: AufsBackend requires root privileges W1205 16:30:33.825335 22505 backend.cpp:76] Failed to create 'bind' backend: BindBackend requires root privileges I1205 16:30:33.825368 22505 provisioner.cpp:298] Using default backend 'copy' I1205 16:30:33.827760 22505 cluster.cpp:485] Creating default 'local' authorizer I1205 16:30:33.829742 22510 slave.cpp:267] Mesos agent started on (444)@172.17.0.3:36088 I1205 16:30:33.829778 22510 slave.cpp:268] Flags at startup: --acls="" --appc_simple_discovery_uri_prefix="http://;
[jira] [Created] (MESOS-9458) PersistentVolumeEndpointsTest.StaticReservation is flaky
Vinod Kone created MESOS-9458: - Summary: PersistentVolumeEndpointsTest.StaticReservation is flaky Key: MESOS-9458 URL: https://issues.apache.org/jira/browse/MESOS-9458 Project: Mesos Issue Type: Bug Components: allocation Reporter: Vinod Kone Observed this in ASF CI https://builds.apache.org/view/M-R/view/Mesos/job/Mesos-Buildbot-Test/310/BUILDTOOL=autotools,COMPILER=gcc,CONFIGURATION=--verbose%20--disable-libtool-wrappers%20--disable-parallel-test-execution,ENVIRONMENT=GLOG_v=1%20MESOS_VERBOSE=1%20MESOS_TEST_AWAIT_TIMEOUT=60secs,OS=ubuntu:16.04,label_exp=(ubuntu)&&(!ubuntu-us1)&&(!ubuntu-eu2)&&(!ubuntu-4)&&(!H21)&&(!H23)&&(!H26)&&(!H27)/consoleText {noformat} [ RUN ] PersistentVolumeEndpointsTest.StaticReservation I1205 11:34:05.896515 22538 cluster.cpp:173] Creating default 'local' authorizer I1205 11:34:05.898870 22542 master.cpp:413] Master 3f2d828b-bff8-461a-98cf-de9163b36657 (488de0351206) started on 172.17.0.2:40803 I1205 11:34:05.898895 22542 master.cpp:416] Flags at startup: --acls="" --agent_ping_timeout="15secs" --agent_reregister_timeout="10mins" --allocation_interval="1000secs" --allocator="hierarchical" --authenticate_agents="true" --authenticate_frameworks="true" --authenticate_http_frameworks="true" --authenticate_http_readonly="true" --authenticate_http_readwrite="true" --authentication_v0_timeout="15secs" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/qOMyLF/credentials" --filter_gpu_resources="true" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --http_framework_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_agent_ping_timeouts="5" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_unreachable_tasks_per_framework="1000" --memory_profiling="false" --min_allocatable_resources="cpus:0.01|mem:32" --port="5050" --publish_per_framework_metrics="true" --quiet="false" 
--recovery_agent_removal_limit="100%" --registry="in_memory" --registry_fetch_timeout="1mins" --registry_gc_interval="15mins" --registry_max_agent_age="2weeks" --registry_max_agent_count="102400" --registry_store_timeout="100secs" --registry_strict="false" --require_agent_domain="false" --role_sorter="drf" --roles="role1" --root_submissions="true" --version="false" --webui_dir="/tmp/SRC/build/mesos-1.8.0/_inst/share/mesos/webui" --work_dir="/tmp/qOMyLF/master" --zk_session_timeout="10secs" I1205 11:34:05.899194 22542 master.cpp:465] Master only allowing authenticated frameworks to register I1205 11:34:05.899205 22542 master.cpp:471] Master only allowing authenticated agents to register I1205 11:34:05.899212 22542 master.cpp:477] Master only allowing authenticated HTTP frameworks to register I1205 11:34:05.899219 22542 credentials.hpp:37] Loading credentials for authentication from '/tmp/qOMyLF/credentials' I1205 11:34:05.899503 22542 master.cpp:521] Using default 'crammd5' authenticator I1205 11:34:05.899674 22542 http.cpp:1042] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readonly' I1205 11:34:05.899879 22542 http.cpp:1042] Creating default 'basic' HTTP authenticator for realm 'mesos-master-readwrite' I1205 11:34:05.900029 22542 http.cpp:1042] Creating default 'basic' HTTP authenticator for realm 'mesos-master-scheduler' I1205 11:34:05.900211 22542 master.cpp:602] Authorization enabled W1205 11:34:05.900238 22542 master.cpp:665] The '--roles' flag is deprecated. This flag will be removed in the future. See the Mesos 0.27 upgrade notes for more information I1205 11:34:05.900684 22539 hierarchical.cpp:175] Initialized hierarchical allocator process I1205 11:34:05.900707 22545 whitelist_watcher.cpp:77] No whitelist given I1205 11:34:05.903553 22540 master.cpp:2105] Elected as the leading master! 
I1205 11:34:05.903587 22540 master.cpp:1660] Recovering from registrar I1205 11:34:05.903753 22551 registrar.cpp:339] Recovering registrar I1205 11:34:05.904373 22551 registrar.cpp:383] Successfully fetched the registry (0B) in 574976ns I1205 11:34:05.904498 22551 registrar.cpp:487] Applied 1 operations in 34823ns; attempting to update the registry I1205 11:34:05.905134 22551 registrar.cpp:544] Successfully updated the registry in 566016ns I1205 11:34:05.905258 22551 registrar.cpp:416] Successfully recovered registrar I1205 11:34:05.905829 22539 master.cpp:1774] Recovered 0 agents from the registry (135B); allowing 10mins for agents to reregister I1205 11:34:05.905889 22540 hierarchical.cpp:215] Skipping recovery of hierarchical allocator: nothing to recover W1205 11:34:05.918561 22538 process.cpp:2829] Attempted to spawn already running process files@172.17.0.2:40803 I1205 11:34:05.919775 22538 containerizer.cpp:305] Using isolation { environment_secret,
[jira] [Commented] (MESOS-6804) Running 'tty' inside a debug container that has a tty reports "Not a tty"
[ https://issues.apache.org/jira/browse/MESOS-6804?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710510#comment-16710510 ] Kevin Klues commented on MESOS-6804: Here are some relevant tickets to help resolve this as other systems have run into this issue in the past: https://bugs.launchpad.net/ubuntu/+source/glibc/+bug/1669578 https://github.com/opencontainers/runc/issues/814 https://github.com/lxc/lxd/issues/1724 https://github.com/moby/moby/issues/8755 https://github.com/opencontainers/runc/blob/v0.1.1/libcontainer/standard_init_linux.go#L58-L83 https://github.com/opencontainers/runc/blob/5d93fed3d27f1e2bab58bad13b180a7a81d0b378/libcontainer/standard_init_linux.go https://github.com/moby/moby/pull/33007 > Running 'tty' inside a debug container that has a tty reports "Not a tty" > - > > Key: MESOS-6804 > URL: https://issues.apache.org/jira/browse/MESOS-6804 > Project: Mesos > Issue Type: Improvement >Reporter: Kevin Klues >Priority: Major > Labels: debugging, mesosphere > > We need to inject `/dev/console` into the container and map it to the slave > end of the TTY we are attached to. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (MESOS-8248) Expose information about GPU assigned to a task
[ https://issues.apache.org/jira/browse/MESOS-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710438#comment-16710438 ]

Chang Lan commented on MESOS-8248:
--

I would like to see GPU resource metrics included in the monitor endpoints. It looks like MESOS-5255 fits my use case better. Thanks [~bmahler].

> Expose information about GPU assigned to a task
> ---
>
> Key: MESOS-8248
> URL: https://issues.apache.org/jira/browse/MESOS-8248
> Project: Mesos
> Issue Type: Improvement
> Components: containerization, gpu
> Reporter: Karthik Anantha Padmanabhan
> Priority: Major
> Labels: GPU
>
> As a framework author I'd like information about the GPU that was assigned to a task.
> `nvidia-smi`, for example, provides the GPU UUID, boardId, minor number, etc. It would be useful to expose this information when a task is assigned to a GPU instance.
> This will make it possible to monitor resource usage for a task on GPU which is not possible when
[jira] [Commented] (MESOS-8248) Expose information about GPU assigned to a task
[ https://issues.apache.org/jira/browse/MESOS-8248?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710387#comment-16710387 ]

Benjamin Mahler commented on MESOS-8248:

Looking at this again, GPU resource usage will be exposed alongside the existing container usage metrics via MESOS-5255. [~changlan] what is your use case for wanting the GPU assignment information exposed in endpoints and to the scheduler?

> Expose information about GPU assigned to a task
> ---
>
> Key: MESOS-8248
> URL: https://issues.apache.org/jira/browse/MESOS-8248
> Project: Mesos
> Issue Type: Improvement
> Components: containerization, gpu
> Reporter: Karthik Anantha Padmanabhan
> Priority: Major
> Labels: GPU
>
> As a framework author I'd like information about the GPU that was assigned to a task.
> `nvidia-smi`, for example, provides the GPU UUID, boardId, minor number, etc. It would be useful to expose this information when a task is assigned to a GPU instance.
> This will make it possible to monitor resource usage for a task on GPU which is not possible when
[jira] [Commented] (MESOS-9099) Add allocator quota tests regarding reserve/unreserve already allocated resources.
[ https://issues.apache.org/jira/browse/MESOS-9099?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710362#comment-16710362 ] Meng Zhu commented on MESOS-9099: - {noformat} commit ed02fb0ca406808db1b61ed3a3c821f1192e553e Author: Meng Zhu Date: Tue Jul 31 12:55:34 2018 -0700 Added tests to ensure correct quota accounting. Added two allocator tests to ensure reserving and unreserving allocated resources do not affect quota accounting. Review: https://reviews.apache.org/r/68138 {noformat} > Add allocator quota tests regarding reserve/unreserve already allocated > resources. > -- > > Key: MESOS-9099 > URL: https://issues.apache.org/jira/browse/MESOS-9099 > Project: Mesos > Issue Type: Task > Components: allocation >Reporter: Meng Zhu >Assignee: Meng Zhu >Priority: Major > Labels: mesosphere > > Add allocator quota tests regarding reserve/unreserve already allocated > resources: > - Reserve already allocated resources should not affect quota headroom; > - The same applies to unreserve allocated resources. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (MESOS-9457) SEGV During Initial Connect Processing
Gary Murphy created MESOS-9457: -- Summary: SEGV During Initial Connect Processing Key: MESOS-9457 URL: https://issues.apache.org/jira/browse/MESOS-9457 Project: Mesos Issue Type: Bug Components: java api Affects Versions: 1.6.1 Environment: CentOS7: Linux mcbride 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux java version "1.8.0_181" Java(TM) SE Runtime Environment (build 1.8.0_181-b13) Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode) Reporter: Gary Murphy Attachments: AbstractScheduler.java, hs_err_pid129798.log, hs_err_pid19228.log While running my Mesos Scheduler v1 implementation, the code sometimes gets a SEGV in the JNI code. It appears to fail in the 'connected' method of the attached code. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (MESOS-9456) Set `SCMP_FLTATR_CTL_LOG` attribute during initialization of Seccomp context
Andrei Budnik created MESOS-9456:

Summary: Set `SCMP_FLTATR_CTL_LOG` attribute during initialization of Seccomp context
Key: MESOS-9456
URL: https://issues.apache.org/jira/browse/MESOS-9456
Project: Mesos
Issue Type: Task
Components: containerization
Reporter: Andrei Budnik

Since version 4.14, the Linux kernel supports the SECCOMP_FILTER_FLAG_LOG flag, which enables logging for all Seccomp filter actions except SECCOMP_RET_ALLOW. If a Seccomp filter does not allow a system call, the kernel prints a message to dmesg when that system call is invoked.

At the moment, libseccomp 2.3.3 does not provide this flag, but the latest master branch of libseccomp supports SECCOMP_FILTER_FLAG_LOG. So, we need to add
{code:java}
seccomp_attr_set(ctx, SCMP_FLTATR_CTL_LOG, 1);{code}
to `SeccompFilter::create()` once a libseccomp release that includes this attribute is available.
[jira] [Commented] (MESOS-9455) Add tests for operation status acknowledgement for different combinations of uuid, agent_id and resource_provider_id
[ https://issues.apache.org/jira/browse/MESOS-9455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16710186#comment-16710186 ]

Benjamin Bannier commented on MESOS-9455:
-

cc [~greggomann] [~gkleiman]

> Add tests for operation status acknowledgement for different combinations of uuid, agent_id and resource_provider_id
>
> Key: MESOS-9455
> URL: https://issues.apache.org/jira/browse/MESOS-9455
> Project: Mesos
> Issue Type: Task
> Components: test
> Reporter: Benjamin Bannier
> Priority: Major
> Labels: mesosphere, tech-debt
>
> An {{OperationStatus}} contains a {{UUID}}, {{AgentID}} and {{ResourceProviderID}}. We should add tests and checks that a status with a {{UUID}} can always be acknowledged by frameworks. The following combinations with set UUID are possible:
> ||{{UUID}}||{{AgentID}}||{{ResourceProviderID}}||explanation||
> |✔️|❌|❌|master-generated {{UUID}}; currently not possible|
> |✔️|✔️|❌|operation on agent-default resources; not possible until a SUM for agent operations is added, MESOS-9278|
> |✔️|✔️|✔️|operation on LRP resources; supported|
> |✔️|❌|✔️|operation on ERP resources; not implemented|
[jira] [Created] (MESOS-9455) Add tests for operation status acknowledgement for different combinations of uuid, agent_id and resource_provider_id
Benjamin Bannier created MESOS-9455:
---

Summary: Add tests for operation status acknowledgement for different combinations of uuid, agent_id and resource_provider_id
Key: MESOS-9455
URL: https://issues.apache.org/jira/browse/MESOS-9455
Project: Mesos
Issue Type: Task
Components: test
Reporter: Benjamin Bannier

An {{OperationStatus}} contains a {{UUID}}, {{AgentID}} and {{ResourceProviderID}}. We should add tests and checks that a status with a {{UUID}} can always be acknowledged by frameworks. The following combinations with set UUID are possible:
||{{UUID}}||{{AgentID}}||{{ResourceProviderID}}||explanation||
|✔️|❌|❌|master-generated {{UUID}}; currently not possible|
|✔️|✔️|❌|operation on agent-default resources; not possible until a SUM for agent operations is added, MESOS-9278|
|✔️|✔️|✔️|operation on LRP resources; supported|
|✔️|❌|✔️|operation on ERP resources; not implemented|
[jira] [Created] (MESOS-9454) Testing JIRA issue sync
Marco Monaco created MESOS-9454: --- Summary: Testing JIRA issue sync Key: MESOS-9454 URL: https://issues.apache.org/jira/browse/MESOS-9454 Project: Mesos Issue Type: Task Reporter: Marco Monaco Assignee: Vinod Kone This is just a test... -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (MESOS-9453) Libprocess does not handle "identity" encoding rules
Benno Evers created MESOS-9453:
--

Summary: Libprocess does not handle "identity" encoding rules
Key: MESOS-9453
URL: https://issues.apache.org/jira/browse/MESOS-9453
Project: Mesos
Issue Type: Bug
Reporter: Benno Evers

[RFC 7231|https://tools.ietf.org/html/rfc7231#section-5.3.4], as well as the relevant [libprocess comment|https://github.com/apache/mesos/blob/dad74012fa02a7fbf61b09968d9b7e9c730b1c97/3rdparty/libprocess/src/http.cpp#L315-L325], mentions special handling of the "identity" encoding. However, this is currently ignored in Mesos, which can lead to incorrect behaviour in combination with MESOS-9451: the request below marks "identity" as unacceptable (q=0), yet the response is sent uncompressed.
{noformat}
$ nc localhost 5050
GET /tasks HTTP/1.1
Accept-Encoding: gzip, identity;q=0

HTTP/1.1 200 OK
Date: Wed, 05 Dec 2018 11:02:24 GMT
Content-Type: application/json
Content-Length: 12

{"tasks":[]}
{noformat}
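The "identity" rule from RFC 7231 section 5.3.4 can be sketched roughly as follows. This is a standalone illustration and not libprocess code: `trim`, `parseAcceptEncoding` and `identityAcceptable` are hypothetical names, and treating a malformed q-value as 1.0 is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cctype>
#include <exception>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: strips leading and trailing spaces and tabs.
static std::string trim(const std::string& s)
{
  const char* ws = " \t";
  const auto begin = s.find_first_not_of(ws);
  if (begin == std::string::npos) {
    return "";
  }
  const auto end = s.find_last_not_of(ws);
  return s.substr(begin, end - begin + 1);
}

// Parses e.g. "gzip, identity;q=0" into {coding -> qvalue}.
// A coding without a q parameter gets the default weight 1.0.
static std::map<std::string, double> parseAcceptEncoding(
    const std::string& header)
{
  std::map<std::string, double> codings;
  std::istringstream in(header);
  std::string item;

  while (std::getline(in, item, ',')) {
    const auto semi = item.find(';');
    std::string coding = trim(item.substr(0, semi));
    if (coding.empty()) {
      continue;
    }
    std::transform(
        coding.begin(), coding.end(), coding.begin(),
        [](unsigned char c) { return static_cast<char>(std::tolower(c)); });

    double q = 1.0;
    if (semi != std::string::npos) {
      const std::string param = trim(item.substr(semi + 1));
      if (param.size() > 2 && std::tolower(param[0]) == 'q' &&
          param[1] == '=') {
        try {
          q = std::stod(param.substr(2));
        } catch (const std::exception&) {
          q = 1.0; // Assumption: ignore a malformed q-value.
        }
      }
    }
    codings[coding] = q;
  }
  return codings;
}

// An uncompressed ("identity") response is acceptable unless it is
// excluded by "identity;q=0", or by "*;q=0" with no explicit
// "identity" entry.
bool identityAcceptable(const std::string& header)
{
  const auto codings = parseAcceptEncoding(header);

  auto it = codings.find("identity");
  if (it != codings.end()) {
    return it->second > 0.0;
  }
  it = codings.find("*");
  if (it != codings.end()) {
    return it->second > 0.0;
  }
  return true;
}
```

For the request shown in the ticket, `identityAcceptable("gzip, identity;q=0")` evaluates to false, so under this reading the 12-byte uncompressed body is not an encoding the client listed as acceptable.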
[jira] [Commented] (MESOS-9451) Libprocess endpoints can ignore required gzip compression
[ https://issues.apache.org/jira/browse/MESOS-9451?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16709943#comment-16709943 ]

Benno Evers commented on MESOS-9451:

Good point, I've opened MESOS-9453 for our lack of handling of the "identity" encoding.

> Libprocess endpoints can ignore required gzip compression
> -
>
> Key: MESOS-9451
> URL: https://issues.apache.org/jira/browse/MESOS-9451
> Project: Mesos
> Issue Type: Bug
> Reporter: Benno Evers
> Priority: Major
> Labels: libprocess
>
> Currently, libprocess decides whether a response should be compressed by the following conditional:
> {noformat}
> if (response.type == http::Response::BODY &&
>     response.body.length() >= GZIP_MINIMUM_BODY_LENGTH &&
>     !headers.contains("Content-Encoding") &&
>     request.acceptsEncoding("gzip")) {
>   [...]
> {noformat}
> However, this implies that a request sent with the header "Accept-Encoding: gzip" cannot rely on actually getting a gzipped response, e.g. when the response size is below the threshold:
> {noformat}
> $ nc localhost 5050
> GET /tasks HTTP/1.1
> Accept-Encoding: gzip
>
> HTTP/1.1 200 OK
> Date: Tue, 04 Dec 2018 12:49:56 GMT
> Content-Type: application/json
> Content-Length: 12
>
> {"tasks":[]}
> {noformat}
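One possible way to combine the size threshold quoted above with the "identity" handling tracked in MESOS-9453 is sketched below. This is an illustration under stated assumptions, not the fix adopted by the project: `shouldGzip` and its boolean parameters are hypothetical, the value 1024 for `GZIP_MINIMUM_BODY_LENGTH` is assumed here (the ticket only names the constant), and the `response.type` check from the original conditional is left out for brevity.

```cpp
#include <cstddef>

// Assumed threshold for this sketch; the ticket does not state the value.
constexpr std::size_t GZIP_MINIMUM_BODY_LENGTH = 1024;

// Hypothetical decision function. Small bodies are normally sent
// uncompressed, but when the client has excluded the "identity"
// encoding, gzip is applied regardless of the body size.
bool shouldGzip(
    std::size_t bodyLength,
    bool hasContentEncoding, // Response already carries Content-Encoding.
    bool acceptsGzip,        // Client lists gzip as acceptable.
    bool acceptsIdentity)    // Client did not exclude "identity".
{
  if (hasContentEncoding || !acceptsGzip) {
    return false;
  }
  return bodyLength >= GZIP_MINIMUM_BODY_LENGTH || !acceptsIdentity;
}
```

With this policy, the 12-byte `/tasks` response stays uncompressed for a plain "Accept-Encoding: gzip" request, where an uncompressed body is still acceptable, and is compressed only when the client also sends "identity;q=0".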