[jira] [Commented] (MESOS-4492) Add metrics for {RESERVE, UNRESERVE} and {CREATE, DESTROY} offer operation
[ https://issues.apache.org/jira/browse/MESOS-4492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168587#comment-15168587 ] Fan Du commented on MESOS-4492: --- Thanks for the kind notice :) > Add metrics for {RESERVE, UNRESERVE} and {CREATE, DESTROY} offer operation > -- > > Key: MESOS-4492 > URL: https://issues.apache.org/jira/browse/MESOS-4492 > Project: Mesos > Issue Type: Improvement > Components: master >Reporter: Fan Du >Assignee: Fan Du >Priority: Minor > > This ticket aims to let users and operators inspect statistics for offer operations > such as RESERVE, UNRESERVE, CREATE and DESTROY; the current implementation only > supports LAUNCH. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4779) strip query strings and fragments from fetcher urls
[ https://issues.apache.org/jira/browse/MESOS-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168300#comment-15168300 ] Shuai Lin commented on MESOS-4779: -- What about adding an extra "name" field to the URI protobuf? That would make it more flexible. (Though this may be addressed in a new ticket.) > strip query strings and fragments from fetcher urls > --- > > Key: MESOS-4779 > URL: https://issues.apache.org/jira/browse/MESOS-4779 > Project: Mesos > Issue Type: Bug > Components: fetcher >Reporter: James Peach >Assignee: James Peach >Priority: Minor > > When the fetcher URL contains a query string or fragment, the fetcher is > unable to check the file extension. We should strip the query and fragment > when constructing the Fetcher basename. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3011) Publish release documentation for major releases on website
[ https://issues.apache.org/jira/browse/MESOS-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Anand Mazumdar updated MESOS-3011: -- Labels: documentation mesosphere (was: documentation) > Publish release documentation for major releases on website > --- > > Key: MESOS-3011 > URL: https://issues.apache.org/jira/browse/MESOS-3011 > Project: Mesos > Issue Type: Documentation > Components: documentation, project website >Reporter: Paul Brett >Assignee: Joerg Schad > Labels: documentation, mesosphere > > Currently, the website only provides a single version of the documentation. > We should publish documentation for each release on the website independently > (for example as https://mesos.apache.org/documentation/0.22/index.html, > https://mesos.apache.org/documentation/0.23/index.html) and make latest > redirect to the current version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4772) TaskInfo/ExecutorInfo should include owner information
[ https://issues.apache.org/jira/browse/MESOS-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168255#comment-15168255 ] Qian Zhang commented on MESOS-4772: --- Very interesting ticket. A few questions: 1. What is the relationship between user and role? Should a user always belong to one or more roles, or are the two orthogonal? 2. Do we need to do user authentication (e.g., based on username/password or a credential) in Mesos, or will that be left to the framework to handle in its own way? 3. Do we plan to integrate Mesos with user management systems (e.g., LDAP, Active Directory) in the future? > TaskInfo/ExecutorInfo should include owner information > -- > > Key: MESOS-4772 > URL: https://issues.apache.org/jira/browse/MESOS-4772 > Project: Mesos > Issue Type: Improvement > Components: security >Reporter: Adam B >Assignee: Jan Schlicht > Labels: authorization, mesosphere, ownership, security > > We need a way to assign fine-grained ownership to tasks/executors so that > multi-user frameworks can tell Mesos to associate the task with a user > identity (rather than just the framework principal+role). Then, when an HTTP > user requests to view the task's sandbox contents, or kill the task, or list > all tasks, the authorizer can determine whether to allow/deny/filter the > request based on finer-grained, user-level ownership. > Some systems may want TaskInfo.owner to represent a group rather than an > individual user. That's fine as long as the framework sets the field to the > group ID in such a way that a group-aware authorizer can interpret it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4632) Document semantics of HTTP endpoints
[ https://issues.apache.org/jira/browse/MESOS-4632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-4632: --- Labels: documentation endpoint mesosphere (was: documentation mesosphere) > Document semantics of HTTP endpoints > > > Key: MESOS-4632 > URL: https://issues.apache.org/jira/browse/MESOS-4632 > Project: Mesos > Issue Type: Documentation > Components: documentation, json api >Reporter: Neil Conway > Labels: documentation, endpoint, mesosphere > > It would be helpful to have more information about the operator endpoints. > For example: > * Which endpoints require authentication (if enabled) > * Which endpoints support which HTTP methods > * The number and format of the parameters expected by each HTTP method -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-3902) Mesos HTTP Scheduler API does not perform master redirection when it's not the leader
[ https://issues.apache.org/jira/browse/MESOS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15168211#comment-15168211 ]

Ben Whitehead commented on MESOS-3902:
--

The master now sets a location header, but it's incomplete. The path of the URL isn't set. Is this intentional? My understanding of the location header is that it's the authority and whatever is provided is supposed to be used.

{code}
> cat /tmp/subscribe-1072944352375841456 | httpp POST
> 127.1.0.3:5050/api/v1/scheduler Content-Type:application/x-protobuf
POST /api/v1/scheduler HTTP/1.1
Accept: application/json
Accept-Encoding: gzip, deflate
Connection: keep-alive
Content-Length: 123
Content-Type: application/x-protobuf
Host: 127.1.0.3:5050
User-Agent: HTTPie/0.9.0

+-----------------------------------------+
| NOTE: binary data not shown in terminal |
+-----------------------------------------+

HTTP/1.1 307 Temporary Redirect
Content-Length: 0
Date: Fri, 26 Feb 2016 00:54:41 GMT
Location: //127.1.0.1:5050
{code}

> Mesos HTTP Scheduler API does not perform master redirection when it's not the leader
> -
>
> Key: MESOS-3902
> URL: https://issues.apache.org/jira/browse/MESOS-3902
> Project: Mesos
> Issue Type: Bug
> Components: HTTP API, master
> Affects Versions: 0.25.0
> Environment: 3 masters, 10 slaves
> Reporter: Ben Whitehead
> Labels: mesosphere
>
> When I attempt to send a {{SUBSCRIBE}} call to a non-leading master instead of getting a 307 as is outlined [here|https://github.com/apache/mesos/blob/master/docs/scheduler-http-api.md#master-detection] I get a 503.
> {code}
> $ cat /tmp/subscribe-943257503176798091.bin | http --print=HhBb --stream --pretty=colors POST http://localhost:6060/api/v1/scheduler Accept:application/x-protobuf Content-Type:application/x-protobuf
>
> POST /api/v1/scheduler HTTP/1.1
> User-Agent: HTTPie/0.9.0
> Host: localhost:6060
> Content-Length: 126
> Connection: keep-alive
> Accept-Encoding: gzip, deflate
> Content-Type: application/x-protobuf
> Accept: application/x-protobuf
>
> +-----------------------------------------+
> | NOTE: binary data not shown in terminal |
> +-----------------------------------------+
>
> HTTP/1.1 503 Service Unavailable
> Date: Wed, 11 Nov 2015 23:02:23 GMT
> Content-Length: 22
>
> Not the leading master
> {code}
> To verify this behavior I started three slaves on my local machines on ports {{6060, 6061, 6062}}. The leader when I ran the above command was {{6061}}.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
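The concern about the incomplete Location header can be illustrated with a short sketch. Under RFC 3986 reference resolution, a network-path reference such as `//127.1.0.1:5050` replaces both the authority and the path of the request URL, so a client that follows the header literally would lose `/api/v1/scheduler`. The helper `follow_redirect` below is hypothetical and is not part of Mesos or HTTPie; it only shows one possible client-side workaround, assuming the client wants to reuse the original request path when the redirect target carries none.

```python
from urllib.parse import urljoin, urlsplit, urlunsplit


def follow_redirect(request_url: str, location: str) -> str:
    """Resolve a Location header against the request URL.

    Hypothetical workaround, not Mesos behavior: if the resolved target has
    no path, reuse the path and query of the original request.
    """
    resolved = urlsplit(urljoin(request_url, location))
    if resolved.path in ("", "/"):
        original = urlsplit(request_url)
        resolved = resolved._replace(path=original.path, query=original.query)
    return urlunsplit(resolved)


request_url = "http://127.1.0.3:5050/api/v1/scheduler"

# Standard resolution of the header from the comment above: the path is gone.
literal = urljoin(request_url, "//127.1.0.1:5050")

# The workaround keeps the scheduler endpoint path.
fixed = follow_redirect(request_url, "//127.1.0.1:5050")

print(literal)
print(fixed)
```

Whether a client should apply such a workaround, or the master should emit the full path, is exactly the open question in the comment; the sketch takes no position on that.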
[jira] [Assigned] (MESOS-4767) Apply batching to allocation events to reduce allocator backlogging.
[ https://issues.apache.org/jira/browse/MESOS-4767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guangya Liu reassigned MESOS-4767: -- Assignee: Guangya Liu > Apply batching to allocation events to reduce allocator backlogging. > > > Key: MESOS-4767 > URL: https://issues.apache.org/jira/browse/MESOS-4767 > Project: Mesos > Issue Type: Improvement > Components: allocation >Reporter: Benjamin Mahler >Assignee: Guangya Liu > > Per the > [discussion|https://issues.apache.org/jira/browse/MESOS-3157?focusedCommentId=14728377=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14728377] > that came out of MESOS-3157, we'd like to batch together outstanding > allocation dispatches in order to avoid backing up the allocator. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2606) Remove mem_file_bytes and mem_anon_bytes in 0.24.0
[ https://issues.apache.org/jira/browse/MESOS-2606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Greg Mann updated MESOS-2606: - Assignee: (was: Chi Zhang) Labels: mesosphere twitter (was: twitter) > Remove mem_file_bytes and mem_anon_bytes in 0.24.0 > -- > > Key: MESOS-2606 > URL: https://issues.apache.org/jira/browse/MESOS-2606 > Project: Mesos > Issue Type: Task >Affects Versions: 0.23.0 >Reporter: Chi Zhang > Labels: mesosphere, twitter > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (MESOS-3011) Publish release documentation for major releases on website
[ https://issues.apache.org/jira/browse/MESOS-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Joerg Schad reassigned MESOS-3011: -- Assignee: Joerg Schad > Publish release documentation for major releases on website > --- > > Key: MESOS-3011 > URL: https://issues.apache.org/jira/browse/MESOS-3011 > Project: Mesos > Issue Type: Documentation > Components: documentation, project website >Reporter: Paul Brett >Assignee: Joerg Schad > Labels: documentation > > Currently, the website only provides a single version of the documentation. > We should publish documentation for each release on the website independently > (for example as https://mesos.apache.org/documentation/0.22/index.html, > https://mesos.apache.org/documentation/0.23/index.html) and make latest > redirect to the current version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3011) Publish release documentation for major releases on website
[ https://issues.apache.org/jira/browse/MESOS-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-3011: --- Labels: documentation (was: ) > Publish release documentation for major releases on website > --- > > Key: MESOS-3011 > URL: https://issues.apache.org/jira/browse/MESOS-3011 > Project: Mesos > Issue Type: Documentation > Components: documentation, project website >Reporter: Paul Brett > Labels: documentation > > Currently, the website only provides a single version of the documentation. > We should publish documentation for each release on the website independently > (for example as https://mesos.apache.org/documentation/0.22/index.html, > https://mesos.apache.org/documentation/0.23/index.html) and make latest > redirect to the current version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3011) Publish release documentation for major releases on website
[ https://issues.apache.org/jira/browse/MESOS-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Neil Conway updated MESOS-3011: --- Component/s: project website documentation > Publish release documentation for major releases on website > --- > > Key: MESOS-3011 > URL: https://issues.apache.org/jira/browse/MESOS-3011 > Project: Mesos > Issue Type: Documentation > Components: documentation, project website >Reporter: Paul Brett > Labels: documentation > > Currently, the website only provides a single version of the documentation. > We should publish documentation for each release on the website independently > (for example as https://mesos.apache.org/documentation/0.22/index.html, > https://mesos.apache.org/documentation/0.23/index.html) and make latest > redirect to the current version. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4782) Extend persistent volume test framework for multiple disks
Neil Conway created MESOS-4782: -- Summary: Extend persistent volume test framework for multiple disks Key: MESOS-4782 URL: https://issues.apache.org/jira/browse/MESOS-4782 Project: Mesos Issue Type: Improvement Reporter: Neil Conway -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3003) Support mounting in default configuration files/volumes into every new container
[ https://issues.apache.org/jira/browse/MESOS-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-3003: -- Assignee: Gilbert Song > Support mounting in default configuration files/volumes into every new > container > > > Key: MESOS-3003 > URL: https://issues.apache.org/jira/browse/MESOS-3003 > Project: Mesos > Issue Type: Improvement > Components: containerization >Reporter: Timothy Chen >Assignee: Gilbert Song > Labels: mesosphere, unified-containerizer-mvp > > Most container images leave out system configuration (e.g., /etc/*) and expect > the container runtime to mount in specific configuration files, such as > /etc/resolv.conf from the host, when needed. > We need to support mounting in specific configuration files for the command > executor to work, and also allow the user to optionally define other > configuration files to mount in via flags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3003) Support mounting in default configuration files/volumes into every new container
[ https://issues.apache.org/jira/browse/MESOS-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-3003: -- Shepherd: Jie Yu > Support mounting in default configuration files/volumes into every new > container > > > Key: MESOS-3003 > URL: https://issues.apache.org/jira/browse/MESOS-3003 > Project: Mesos > Issue Type: Improvement > Components: containerization >Reporter: Timothy Chen > Labels: mesosphere, unified-containerizer-mvp > > Most container images leave out system configuration (e.g., /etc/*) and expect > the container runtime to mount in specific configuration files, such as > /etc/resolv.conf from the host, when needed. > We need to support mounting in specific configuration files for the command > executor to work, and also allow the user to optionally define other > configuration files to mount in via flags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3003) Support mounting in default configuration files/volumes into every new container
[ https://issues.apache.org/jira/browse/MESOS-3003?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-3003: -- Sprint: Mesosphere Sprint 30 > Support mounting in default configuration files/volumes into every new > container > > > Key: MESOS-3003 > URL: https://issues.apache.org/jira/browse/MESOS-3003 > Project: Mesos > Issue Type: Improvement > Components: containerization >Reporter: Timothy Chen > Labels: mesosphere, unified-containerizer-mvp > > Most container images leave out system configuration (e.g., /etc/*) and expect > the container runtime to mount in specific configuration files, such as > /etc/resolv.conf from the host, when needed. > We need to support mounting in specific configuration files for the command > executor to work, and also allow the user to optionally define other > configuration files to mount in via flags. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4676) ROOT_DOCKER_Logs is flaky.
[ https://issues.apache.org/jira/browse/MESOS-4676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167831#comment-15167831 ]

Joseph Wu commented on MESOS-4676:
--

{code}
commit d0d4d5a64e8aa17f0bc364060d98690b49037550
Author: Joseph Wu
Date: Thu Feb 25 12:22:07 2016 +0100

Fixed flakiness in DockerContainerizerTest.ROOT_DOCKER_Logs.

Adds the `unbuffer` utility in front of each `echo` in the test. Since Docker appears to handle simultaneous stdout/stderr in a non-robust fashion, this mitigates the amount of overlap the two streams will have in the test.

Review: https://reviews.apache.org/r/43963/
{code}

> ROOT_DOCKER_Logs is flaky.
> --
>
> Key: MESOS-4676
> URL: https://issues.apache.org/jira/browse/MESOS-4676
> Project: Mesos
> Issue Type: Bug
> Affects Versions: 0.27
> Environment: CentOS 7 with SSL.
> Reporter: Bernd Mathiske
> Assignee: Joseph Wu
> Labels: flaky, mesosphere, test
> Fix For: 0.28.0
>
>
> {noformat}
> [18:06:25][Step 8/8] [ RUN ] DockerContainerizerTest.ROOT_DOCKER_Logs
> [18:06:25][Step 8/8] I0215 17:06:25.256103 1740 leveldb.cpp:174] Opened db in 6.548327ms
> [18:06:25][Step 8/8] I0215 17:06:25.258002 1740 leveldb.cpp:181] Compacted db in 1.837816ms
> [18:06:25][Step 8/8] I0215 17:06:25.258059 1740 leveldb.cpp:196] Created db iterator in 22044ns
> [18:06:25][Step 8/8] I0215 17:06:25.258076 1740 leveldb.cpp:202] Seeked to beginning of db in 2347ns
> [18:06:25][Step 8/8] I0215 17:06:25.258091 1740 leveldb.cpp:271] Iterated through 0 keys in the db in 571ns
> [18:06:25][Step 8/8] I0215 17:06:25.258152 1740 replica.cpp:779] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
> [18:06:25][Step 8/8] I0215 17:06:25.258936 1758 recover.cpp:447] Starting replica recovery
> [18:06:25][Step 8/8] I0215 17:06:25.259177 1758 recover.cpp:473] Replica is in EMPTY status
> [18:06:25][Step 8/8] I0215 17:06:25.260327 1757 replica.cpp:673] Replica in EMPTY status received a broadcasted recover request from (13608)@172.30.2.239:39785
> [18:06:25][Step 8/8] I0215 17:06:25.260545 1758 recover.cpp:193] Received a recover response from a replica in EMPTY status
> [18:06:25][Step 8/8] I0215 17:06:25.261065 1757 master.cpp:376] Master 112363e2-c680-4946-8fee-d0626ed8b21e (ip-172-30-2-239.mesosphere.io) started on 172.30.2.239:39785
> [18:06:25][Step 8/8] I0215 17:06:25.261209 1761 recover.cpp:564] Updating replica status to STARTING
> [18:06:25][Step 8/8] I0215 17:06:25.261086 1757 master.cpp:378] Flags at startup: --acls="" --allocation_interval="1secs" --allocator="HierarchicalDRF" --authenticate="true" --authenticate_http="true" --authenticate_slaves="true" --authenticators="crammd5" --authorizers="local" --credentials="/tmp/HncLLj/credentials" --framework_sorter="drf" --help="false" --hostname_lookup="true" --http_authenticators="basic" --initialize_driver_logging="true" --log_auto_initialize="true" --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" --quiet="false" --recovery_slave_removal_limit="100%" --registry="replicated_log" --registry_fetch_timeout="1mins" --registry_store_timeout="100secs" --registry_strict="true" --root_submissions="true" --slave_ping_timeout="15secs" --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/HncLLj/master" --zk_session_timeout="10secs"
> [18:06:25][Step 8/8] I0215 17:06:25.261446 1757 master.cpp:423] Master only allowing authenticated frameworks to register
> [18:06:25][Step 8/8] I0215 17:06:25.261456 1757 master.cpp:428] Master only allowing authenticated slaves to register
> [18:06:25][Step 8/8] I0215 17:06:25.261462 1757 credentials.hpp:35] Loading credentials for authentication from '/tmp/HncLLj/credentials'
> [18:06:25][Step 8/8] I0215 17:06:25.261723 1757 master.cpp:468] Using default 'crammd5' authenticator
> [18:06:25][Step 8/8] I0215 17:06:25.261855 1757 master.cpp:537] Using default 'basic' HTTP authenticator
> [18:06:25][Step 8/8] I0215 17:06:25.262022 1757 master.cpp:571] Authorization enabled
> [18:06:25][Step 8/8] I0215 17:06:25.262177 1755 hierarchical.cpp:144] Initialized hierarchical allocator process
> [18:06:25][Step 8/8] I0215 17:06:25.262177 1758 whitelist_watcher.cpp:77] No whitelist given
> [18:06:25][Step 8/8] I0215 17:06:25.262899 1760 leveldb.cpp:304] Persisting metadata (8 bytes) to leveldb took 1.517992ms
> [18:06:25][Step 8/8] I0215 17:06:25.262924 1760 replica.cpp:320] Persisted replica status to STARTING
> [18:06:25][Step 8/8] I0215
[jira] [Created] (MESOS-4780) Remove `user` and `rootfs` flags in Windows launcher.
Alex Clemmer created MESOS-4780: --- Summary: Remove `user` and `rootfs` flags in Windows launcher. Key: MESOS-4780 URL: https://issues.apache.org/jira/browse/MESOS-4780 Project: Mesos Issue Type: Bug Reporter: Alex Clemmer Assignee: Alex Clemmer -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4779) strip query strings and fragments from fetcher urls
[ https://issues.apache.org/jira/browse/MESOS-4779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167603#comment-15167603 ] James Peach commented on MESOS-4779: https://reviews.apache.org/r/44029/ > strip query strings and fragments from fetcher urls > --- > > Key: MESOS-4779 > URL: https://issues.apache.org/jira/browse/MESOS-4779 > Project: Mesos > Issue Type: Bug > Components: fetcher >Reporter: James Peach >Assignee: James Peach >Priority: Minor > > When the fetcher URL contains a query string or fragment, the fetcher is > unable to check the file extension. We should strip the query and fragment > when constructing the Fetcher basename. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4779) strip query strings and fragments from fetcher urls
James Peach created MESOS-4779: -- Summary: strip query strings and fragments from fetcher urls Key: MESOS-4779 URL: https://issues.apache.org/jira/browse/MESOS-4779 Project: Mesos Issue Type: Bug Components: fetcher Reporter: James Peach Assignee: James Peach Priority: Minor When the fetcher URL contains a query string or fragment, the fetcher is unable to check the file extension. We should strip the query and fragment when constructing the Fetcher basename. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
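The behaviour this ticket asks for can be sketched in a few lines. The Mesos fetcher itself is C++; the function `fetcher_basename` and the example URLs below are hypothetical and only illustrate the idea, assuming the basename should be taken from the URL path component alone.

```python
import posixpath
from urllib.parse import urlsplit


def fetcher_basename(uri: str) -> str:
    """Return the last path component of a URI, ignoring query and fragment.

    Hypothetical helper for illustration; not the Mesos fetcher code.
    """
    path = urlsplit(uri).path
    return posixpath.basename(path)


uri = "http://example.com/pkg/archive.tar.gz?token=abc#section"

# Splitting on the last "/" keeps the query and fragment, so a check for
# a ".tar.gz" suffix fails.
naive = uri.rsplit("/", 1)[-1]

# Using only the path component yields a name whose extension can be checked.
clean = fetcher_basename(uri)

print(naive)
print(clean)
```

A query string that itself contains a slash (for example `?next=/x/y`) is another case where splitting the raw string picks the wrong component, which is why the sketch parses the URI first.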
[jira] [Updated] (MESOS-4750) Document: Mesos Executor expects all SSL_* environment variables to be set
[ https://issues.apache.org/jira/browse/MESOS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4750: - Assignee: Jan Schlicht > Document: Mesos Executor expects all SSL_* environment variables to be set > -- > > Key: MESOS-4750 > URL: https://issues.apache.org/jira/browse/MESOS-4750 > Project: Mesos > Issue Type: Documentation > Components: documentation, general, slave >Affects Versions: 0.26.0 >Reporter: pawan >Assignee: Jan Schlicht > Labels: documentation, mesosphere, ssl > Original Estimate: 2h > Remaining Estimate: 2h > > I was trying to run Docker containers in a fully SSL-ized Mesos cluster but > ran into problems because the executor was failing with a "Failed to shutdown > socket with fd 10: Transport endpoint is not connected". > My understanding is that this happens because the executor was trying > to report its status to the Mesos slave over HTTPS, but doesn't have the > appropriate certs/env set up inside the executor. > (Thanks to mslackbot/joseph for helping me figure this out on #mesos) > It turns out the executor expects all SSL_* variables to be set inside > `CommandInfo.environment`, which gets picked up by the executor so that it can > successfully report its status to the slave. > This part of __executor needing all the SSL_* variables to be set in its > environment__ is missing in the Mesos SSL transitioning guide. I request you > to please add this vital information to the doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
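As a rough illustration of what the reporter describes, the sketch below copies every `SSL_*` variable from a process environment into a structure shaped like the JSON form of `CommandInfo.environment` (a list of name/value pairs under `variables`). The function name `ssl_environment` and the sample values and file paths are assumptions made for this example; how a given framework actually populates `CommandInfo` is not shown in the ticket.

```python
from typing import Dict, List, Mapping


def ssl_environment(env: Mapping[str, str]) -> Dict[str, List[Dict[str, str]]]:
    """Collect all SSL_* variables into an Environment-shaped dictionary.

    Hypothetical helper: the result mirrors the JSON layout of
    CommandInfo.environment ({"variables": [{"name": ..., "value": ...}]}).
    """
    variables = [
        {"name": name, "value": value}
        for name, value in sorted(env.items())
        if name.startswith("SSL_")
    ]
    return {"variables": variables}


# Sample environment; the paths are placeholders, not real certificate files.
sample_env = {
    "PATH": "/usr/bin",
    "SSL_ENABLED": "true",
    "SSL_KEY_FILE": "/path/to/key.pem",
    "SSL_CERT_FILE": "/path/to/cert.pem",
}

environment = ssl_environment(sample_env)
print(environment)
```

In a real framework one would likely pass `os.environ` instead of `sample_env`, so that the executor sees the same SSL settings as the process launching it.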
[jira] [Created] (MESOS-4778) Add appc/runtime isolator for runtime isolation for appc images.
Jie Yu created MESOS-4778: - Summary: Add appc/runtime isolator for runtime isolation for appc images. Key: MESOS-4778 URL: https://issues.apache.org/jira/browse/MESOS-4778 Project: Mesos Issue Type: Task Reporter: Jie Yu Appc image also contains runtime information like 'exec', 'env', 'workingDirectory' etc. https://github.com/appc/spec/blob/master/spec/aci.md Similar to docker images, we need to support a subset of them (mainly 'exec', 'env' and 'workingDirectory'). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4772) TaskInfo/ExecutorInfo should include owner information
[ https://issues.apache.org/jira/browse/MESOS-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4772: - Assignee: Jan Schlicht > TaskInfo/ExecutorInfo should include owner information > -- > > Key: MESOS-4772 > URL: https://issues.apache.org/jira/browse/MESOS-4772 > Project: Mesos > Issue Type: Improvement > Components: security >Reporter: Adam B >Assignee: Jan Schlicht > Labels: authorization, mesosphere, ownership, security > > We need a way to assign fine-grained ownership to tasks/executors so that > multi-user frameworks can tell Mesos to associate the task with a user > identity (rather than just the framework principal+role). Then, when an HTTP > user requests to view the task's sandbox contents, or kill the task, or list > all tasks, the authorizer can determine whether to allow/deny/filter the > request based on finer-grained, user-level ownership. > Some systems may want TaskInfo.owner to represent a group rather than an > individual user. That's fine as long as the framework sets the field to the > group ID in such a way that a group-aware authorizer can interpret it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4750) Document: Mesos Executor expects all SSL_* environment variables to be set
[ https://issues.apache.org/jira/browse/MESOS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4750: -- Sprint: Mesosphere Sprint 30 > Document: Mesos Executor expects all SSL_* environment variables to be set > -- > > Key: MESOS-4750 > URL: https://issues.apache.org/jira/browse/MESOS-4750 > Project: Mesos > Issue Type: Documentation > Components: documentation, general, slave >Affects Versions: 0.26.0 >Reporter: pawan > Labels: documentation, mesosphere, ssl > Original Estimate: 2h > Remaining Estimate: 2h > > I was trying to run Docker containers in a fully SSL-ized Mesos cluster but > ran into problems because the executor was failing with a "Failed to shutdown > socket with fd 10: Transport endpoint is not connected". > My understanding is that this happens because the executor was trying > to report its status to the Mesos slave over HTTPS, but doesn't have the > appropriate certs/env set up inside the executor. > (Thanks to mslackbot/joseph for helping me figure this out on #mesos) > It turns out the executor expects all SSL_* variables to be set inside > `CommandInfo.environment`, which gets picked up by the executor so that it can > successfully report its status to the slave. > This part of __executor needing all the SSL_* variables to be set in its > environment__ is missing in the Mesos SSL transitioning guide. I request you > to please add this vital information to the doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4750) Document: Mesos Executor expects all SSL_* environment variables to be set
[ https://issues.apache.org/jira/browse/MESOS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4750: -- Story Points: 2 > Document: Mesos Executor expects all SSL_* environment variables to be set > -- > > Key: MESOS-4750 > URL: https://issues.apache.org/jira/browse/MESOS-4750 > Project: Mesos > Issue Type: Documentation > Components: documentation, general, slave >Affects Versions: 0.26.0 >Reporter: pawan > Labels: documentation, mesosphere, ssl > Original Estimate: 2h > Remaining Estimate: 2h > > I was trying to run Docker containers in a fully SSL-ized Mesos cluster but > ran into problems because the executor was failing with a "Failed to shutdown > socket with fd 10: Transport endpoint is not connected". > My understanding is that this happens because the executor was trying > to report its status to the Mesos slave over HTTPS, but doesn't have the > appropriate certs/env set up inside the executor. > (Thanks to mslackbot/joseph for helping me figure this out on #mesos) > It turns out the executor expects all SSL_* variables to be set inside > `CommandInfo.environment`, which gets picked up by the executor so that it can > successfully report its status to the slave. > This part of __executor needing all the SSL_* variables to be set in its > environment__ is missing in the Mesos SSL transitioning guide. I request you > to please add this vital information to the doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4776) Libprocess metrics/snapshot endpoint rate limiting should be configurable.
[ https://issues.apache.org/jira/browse/MESOS-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4776: --- Description: Currently the {{/metrics/snapshot}} endpoint in libprocess has a [hard-coded|https://github.com/apache/mesos/blob/0.27.1/3rdparty/libprocess/include/process/metrics/metrics.hpp#L52] rate limit of 2 requests per second: {code} MetricsProcess() : ProcessBase("metrics"), limiter(2, Seconds(1)) {} {code} This should be configurable via a libprocess environment variable so that users can control this when initializing libprocess. Summary: Libprocess metrics/snapshot endpoint rate limiting should be configurable. (was: It should be possible to disable rate limiting of the metrics endpoint for tests) > Libprocess metrics/snapshot endpoint rate limiting should be configurable. > -- > > Key: MESOS-4776 > URL: https://issues.apache.org/jira/browse/MESOS-4776 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > > Currently the {{/metrics/snapshot}} endpoint in libprocess has a > [hard-coded|https://github.com/apache/mesos/blob/0.27.1/3rdparty/libprocess/include/process/metrics/metrics.hpp#L52] > rate limit of 2 requests per second: > {code} > MetricsProcess() > : ProcessBase("metrics"), > limiter(2, Seconds(1)) {} > {code} > This should be configurable via a libprocess environment variable so that > users can control this when initializing libprocess. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
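One way to picture the requested change is a limiter whose parameters come from the environment instead of being hard-coded. The sketch below is in Python rather than libprocess C++, and everything in it is an assumption for illustration: the variable name `LIBPROCESS_METRICS_RATE_LIMIT`, its `<permits>/<seconds>` format, and the `RateLimiter` class are hypothetical. Note also that this sketch rejects excess requests, which is a simplification and not a claim about how the libprocess limiter treats them.

```python
from collections import deque
from typing import Deque, Mapping, Tuple

# Hypothetical variable name; not an existing libprocess setting.
ENV_VAR = "LIBPROCESS_METRICS_RATE_LIMIT"


def parse_rate_limit(
    env: Mapping[str, str], default: Tuple[int, float] = (2, 1.0)
) -> Tuple[int, float]:
    """Read "<permits>/<seconds>" from the environment, else use the default.

    The default mirrors the hard-coded limit of 2 requests per second.
    """
    raw = env.get(ENV_VAR)
    if not raw:
        return default
    permits, _, seconds = raw.partition("/")
    return int(permits), float(seconds or 1.0)


class RateLimiter:
    """Sliding-window limiter: at most `permits` requests per `seconds`."""

    def __init__(self, permits: int, seconds: float) -> None:
        self.permits = permits
        self.seconds = seconds
        self._stamps: Deque[float] = deque()

    def allow(self, now: float) -> bool:
        # Forget requests that have left the window.
        while self._stamps and now - self._stamps[0] >= self.seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.permits:
            self._stamps.append(now)
            return True
        return False


# With the variable unset, the limiter behaves like the current default.
limiter = RateLimiter(*parse_rate_limit({}))
decisions = [limiter.allow(t) for t in (0.0, 0.1, 0.2, 1.0)]
print(decisions)
```

Passing timestamps explicitly keeps the sketch deterministic; a test that needs the limit lifted could then set the variable to a large permit count before initialization.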
[jira] [Updated] (MESOS-4772) TaskInfo/ExecutorInfo should include owner information
[ https://issues.apache.org/jira/browse/MESOS-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4772: -- Story Points: 2 > TaskInfo/ExecutorInfo should include owner information > -- > > Key: MESOS-4772 > URL: https://issues.apache.org/jira/browse/MESOS-4772 > Project: Mesos > Issue Type: Improvement > Components: security >Reporter: Adam B > Labels: authorization, mesosphere, ownership, security > > We need a way to assign fine-grained ownership to tasks/executors so that > multi-user frameworks can tell Mesos to associate the task with a user > identity (rather than just the framework principal+role). Then, when an HTTP > user requests to view the task's sandbox contents, or kill the task, or list > all tasks, the authorizer can determine whether to allow/deny/filter the > request based on finer-grained, user-level ownership. > Some systems may want TaskInfo.owner to represent a group rather than an > individual user. That's fine as long as the framework sets the field to the > group ID in such a way that a group-aware authorizer can interpret it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4772) TaskInfo/ExecutorInfo should include owner information
[ https://issues.apache.org/jira/browse/MESOS-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4772: - Sprint: Mesosphere Sprint 30 > TaskInfo/ExecutorInfo should include owner information > -- > > Key: MESOS-4772 > URL: https://issues.apache.org/jira/browse/MESOS-4772 > Project: Mesos > Issue Type: Improvement > Components: security >Reporter: Adam B > Labels: authorization, mesosphere, ownership, security > > We need a way to assign fine-grained ownership to tasks/executors so that > multi-user frameworks can tell Mesos to associate the task with a user > identity (rather than just the framework principal+role). Then, when an HTTP > user requests to view the task's sandbox contents, or kill the task, or list > all tasks, the authorizer can determine whether to allow/deny/filter the > request based on finer-grained, user-level ownership. > Some systems may want TaskInfo.owner to represent a group rather than an > individual user. That's fine as long as the framework sets the field to the > group ID in such a way that a group-aware authorizer can interpret it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4750) Document: Mesos Executor expects all SSL_* environment variables to be set
[ https://issues.apache.org/jira/browse/MESOS-4750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4750: -- Summary: Document: Mesos Executor expects all SSL_* environment variables to be set (was: Mesos Executor expects all SSL_* environment variables to be set) > Document: Mesos Executor expects all SSL_* environment variables to be set > -- > > Key: MESOS-4750 > URL: https://issues.apache.org/jira/browse/MESOS-4750 > Project: Mesos > Issue Type: Documentation > Components: documentation, general, slave >Affects Versions: 0.26.0 >Reporter: pawan > Labels: documentation, mesosphere, ssl > Original Estimate: 2h > Remaining Estimate: 2h > > I was trying to run Docker containers in a fully SSL-ized Mesos cluster but > ran into problems because the executor was failing with a "Failed to shutdown > socket with fd 10: Transport endpoint is not connected". > My understanding is that this happens because the executor was trying > to report its status to the Mesos slave over HTTPS, but doesn't have the > appropriate certs/env set up inside the executor. > (Thanks to mslackbot/joseph for helping me figure this out on #mesos) > It turns out that the executor expects all SSL_* variables to be set inside > `CommandInfo.environment`, which gets picked up by the executor so that it can > successfully report its status to the slave. > This requirement, __the executor needing all the SSL_* variables to be set in its > environment__, is missing from the Mesos SSL transitioning guide. I request you > to please add this vital information to the doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4630) Implement partition tests for the HTTP V1 API.
[ https://issues.apache.org/jira/browse/MESOS-4630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4630: - Sprint: Mesosphere Sprint 30 > Implement partition tests for the HTTP V1 API. > -- > > Key: MESOS-4630 > URL: https://issues.apache.org/jira/browse/MESOS-4630 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar > Labels: mesosphere > > Currently, the HTTP V1 API does not have partition tests similar to the one > in src/tests/partition_tests.cpp. > For more information see MESOS-3355. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4583) Rename `examples/event_call_framework.cpp` to `examples/test_http_framework.cpp`
[ https://issues.apache.org/jira/browse/MESOS-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4583: - Sprint: Mesosphere Sprint 30 > Rename `examples/event_call_framework.cpp` to > `examples/test_http_framework.cpp` > > > Key: MESOS-4583 > URL: https://issues.apache.org/jira/browse/MESOS-4583 > Project: Mesos > Issue Type: Bug >Reporter: Anand Mazumdar > Labels: mesosphere, newbie > > We already have {{examples/test_framework.cpp}} for testing {{PID}} based > frameworks. We would ideally want to rename {{event_call_framework}} to > correctly reflect that it's an example for HTTP based framework. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4777) Support port mapping in unified containerizer.
[ https://issues.apache.org/jira/browse/MESOS-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-4777: -- Labels: mesosphere (was: ) > Support port mapping in unified containerizer. > -- > > Key: MESOS-4777 > URL: https://issues.apache.org/jira/browse/MESOS-4777 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu > Labels: mesosphere > > For instance, if bridge network is used, we need to setup NAT rules so that > external hosts can access the services running inside the container. This is > similar to Docker's port mapping (DockerInfo.PortMapping). > We need to think about how to add the API (e.g., in NetworkInfo), and how to > add the implementation (e.g., in docker/runtime isolator or rely on CNI > plugin to do that). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4629) Implement fault tolerance tests for the HTTP V1 API.
[ https://issues.apache.org/jira/browse/MESOS-4629?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4629: - Sprint: Mesosphere Sprint 30 > Implement fault tolerance tests for the HTTP V1 API. > > > Key: MESOS-4629 > URL: https://issues.apache.org/jira/browse/MESOS-4629 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar > Labels: mesosphere > > Currently, the HTTP V1 API does not have fault tolerance tests similar to the > one in {{src/tests/fault_tolerance_tests.cpp}}. > For more information see MESOS-3355. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3583) Introduce sessions in HTTP Scheduler API
[ https://issues.apache.org/jira/browse/MESOS-3583?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-3583: - Sprint: Mesosphere Sprint 30 > Introduce sessions in HTTP Scheduler API > > > Key: MESOS-3583 > URL: https://issues.apache.org/jira/browse/MESOS-3583 > Project: Mesos > Issue Type: Task >Reporter: Anand Mazumdar >Assignee: Greg Mann > Labels: mesosphere, tech-debt > > Currently, the HTTP Scheduler API has no concept of sessions, i.e. a > {{SessionID}} or a {{TokenID}}. Such a concept would be useful in some failure scenarios. As > of now, if a framework fails over and then subscribes again with the same > {{FrameworkID}} with the {{force}} option set, the Mesos master would > subscribe it. > If the previous instance of the framework/scheduler tries to send a Call, > e.g. {{Call::KILL}}, with the same previous {{FrameworkID}} set, it would > still be accepted by the master, leading to a task being killed erroneously. > This is possible because we currently have no way of distinguishing > connections. It used to work in the previous driver implementation because the > master also performed a {{UPID}} check to verify that they matched and only > then allowed the call. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
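A minimal sketch of the session idea, assuming the master hands out a fresh session ID on every subscription and only accepts calls that carry the current one. `SessionRegistry` and its methods are hypothetical and not part of Mesos, and the counter-based IDs are for illustration only; a real implementation would presumably use unguessable tokens.

```cpp
#include <cstdint>
#include <string>
#include <unordered_map>

// Hypothetical registry mapping each framework ID to its current session.
class SessionRegistry
{
public:
  // Called on every (re)subscription. Replaces any previous session for
  // the framework, which invalidates calls from the old instance.
  std::uint64_t subscribe(const std::string& frameworkId)
  {
    const std::uint64_t sessionId = ++next;
    sessions[frameworkId] = sessionId;
    return sessionId;
  }

  // A call is accepted only if it carries the framework's current session.
  bool accept(const std::string& frameworkId, std::uint64_t sessionId) const
  {
    const auto it = sessions.find(frameworkId);
    return it != sessions.end() && it->second == sessionId;
  }

private:
  std::uint64_t next = 0;
  std::unordered_map<std::string, std::uint64_t> sessions;
};
```

With this in place, a `Call::KILL` sent by the previous scheduler instance would carry the stale session ID and be rejected, even though its `FrameworkID` still matches.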
[jira] [Updated] (MESOS-4776) It should be possible to disable rate limiting of the metrics endpoint for tests
[ https://issues.apache.org/jira/browse/MESOS-4776?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4776: --- Component/s: libprocess > It should be possible to disable rate limiting of the metrics endpoint for > tests > > > Key: MESOS-4776 > URL: https://issues.apache.org/jira/browse/MESOS-4776 > Project: Mesos > Issue Type: Improvement > Components: libprocess >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4762) Setup proper DNS resolver for containers in network/cni isolator.
[ https://issues.apache.org/jira/browse/MESOS-4762?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4762: - Sprint: Mesosphere Sprint 30 > Setup proper DNS resolver for containers in network/cni isolator. > - > > Key: MESOS-4762 > URL: https://issues.apache.org/jira/browse/MESOS-4762 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu >Assignee: Avinash Sridharan > Labels: mesosphere > > Please get more context from the design doc (MESOS-4742). > The CNI plugin will return the DNS information about the network. The > network/cni isolator needs to properly setup /etc/resolv.conf for the > container. We should consider the following cases: > 1) container is using host filesystem > 2) container is using a different filesystem > 3) custom executor and command executor -- This message was sent by Atlassian JIRA (v6.3.4#6332)
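A minimal sketch of turning the DNS information returned by a CNI plugin into the contents of a container's /etc/resolv.conf. `DnsInfo` and `renderResolvConf` are hypothetical names; the fields are assumed to mirror the `dns` object of a CNI result (nameservers, domain, search, options). Where the file is written for each of the cases listed above is not covered here.

```cpp
#include <string>
#include <vector>

// Hypothetical holder for the DNS section of a CNI plugin result.
struct DnsInfo
{
  std::vector<std::string> nameservers;
  std::string domain;
  std::vector<std::string> search;
  std::vector<std::string> options;
};

// Renders the DNS information in resolv.conf syntax.
std::string renderResolvConf(const DnsInfo& dns)
{
  std::string out;

  // `domain` and `search` are mutually exclusive in resolv.conf,
  // so prefer the search list when both are present.
  if (!dns.search.empty()) {
    out += "search";
    for (const std::string& entry : dns.search) {
      out += " " + entry;
    }
    out += "\n";
  } else if (!dns.domain.empty()) {
    out += "domain " + dns.domain + "\n";
  }

  for (const std::string& nameserver : dns.nameservers) {
    out += "nameserver " + nameserver + "\n";
  }

  if (!dns.options.empty()) {
    out += "options";
    for (const std::string& option : dns.options) {
      out += " " + option;
    }
    out += "\n";
  }

  return out;
}
```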
[jira] [Updated] (MESOS-4763) Add test mock for CNI plugins.
[ https://issues.apache.org/jira/browse/MESOS-4763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4763: - Sprint: Mesosphere Sprint 30 > Add test mock for CNI plugins. > -- > > Key: MESOS-4763 > URL: https://issues.apache.org/jira/browse/MESOS-4763 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu >Assignee: Avinash Sridharan > Labels: mesosphere > > In order to test the network/cni isolator, we need to mock the behavior of a > CNI plugin. One option is to write a mock script which acts as a CNI plugin. > The isolator will talk to the mock script the same way it talks to an actual > CNI plugin. > Could the mock script just join the host network? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4759) Add network/cni isolator for Mesos containerizer.
[ https://issues.apache.org/jira/browse/MESOS-4759?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4759: - Sprint: Mesosphere Sprint 30 > Add network/cni isolator for Mesos containerizer. > - > > Key: MESOS-4759 > URL: https://issues.apache.org/jira/browse/MESOS-4759 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu >Assignee: Qian Zhang > > See the design doc for more context (MESOS-4742). > The isolator will interact with CNI plugins to create the network for the > container to join. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4764) The network/cni isolator should report assigned IP address.
[ https://issues.apache.org/jira/browse/MESOS-4764?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4764: - Sprint: Mesosphere Sprint 30 > The network/cni isolator should report assigned IP address. > > > Key: MESOS-4764 > URL: https://issues.apache.org/jira/browse/MESOS-4764 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu >Assignee: Qian Zhang > > In order for service discovery to work in some cases, the network/cni > isolator needs to report the assigned IP address through the > isolator->status() interface. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4761) Add agent flags to allow operators to specify CNI plugin and config directories.
[ https://issues.apache.org/jira/browse/MESOS-4761?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Artem Harutyunyan updated MESOS-4761: - Sprint: Mesosphere Sprint 30 > Add agent flags to allow operators to specify CNI plugin and config > directories. > > > Key: MESOS-4761 > URL: https://issues.apache.org/jira/browse/MESOS-4761 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu >Assignee: Qian Zhang > > According to design doc, we plan to add the following flags: > “--network_cni_plugins_dir” > Location of the CNI plugin binaries. The “network/cni” isolator will find CNI > plugins under this directory so that it can execute the plugins to add/delete > container from the CNI networks. It is the operator’s responsibility to > install the CNI plugin binaries in the specified directory. > “--network_cni_config_dir” > Location of the CNI network configuration files. For each network that > containers launched in Mesos agent can connect to, the operator should > install a network configuration file in JSON format in the specified > directory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4673) Agent fails to shutdown after re-registering period timed-out.
[ https://issues.apache.org/jira/browse/MESOS-4673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Mahler updated MESOS-4673: --- Component/s: docker > Agent fails to shutdown after re-registering period timed-out. > -- > > Key: MESOS-4673 > URL: https://issues.apache.org/jira/browse/MESOS-4673 > Project: Mesos > Issue Type: Bug > Components: docker >Reporter: Jan Schlicht >Assignee: Jan Schlicht > Labels: mesosphere > > Under certain conditions, when a Mesos agent loses its connection to the master > for an extended period of time (say, a switch fails), the master will > de-register the agent, and then when the agent comes back up, refuse to let > it register: {{Slave asked to shut down by master@10.102.25.1:5050 because > 'Slave attempted to re-register after removal'}}. > The agent doesn't seem to be able to shut down properly and remove running > tasks, as it should in order to register as a new agent. Hence this message will > persist until it is resolved by manual intervention. > This seems to be caused by Docker tasks that couldn't shut down cleanly when > the agent is asked to shut down running tasks so that it can register as a new > agent with the master. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4772) TaskInfo/ExecutorInfo should include owner information
[ https://issues.apache.org/jira/browse/MESOS-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4772: -- Summary: TaskInfo/ExecutorInfo should include owner information (was: a) > TaskInfo/ExecutorInfo should include owner information > -- > > Key: MESOS-4772 > URL: https://issues.apache.org/jira/browse/MESOS-4772 > Project: Mesos > Issue Type: Improvement > Components: security >Reporter: Adam B > Labels: authorization, mesosphere, ownership, security > > We need a way to assign fine-grained ownership to tasks/executors so that > multi-user frameworks can tell Mesos to associate the task with a user > identity (rather than just the framework principal+role). Then, when an HTTP > user requests to view the task's sandbox contents, or kill the task, or list > all tasks, the authorizer can determine whether to allow/deny/filter the > request based on finer-grained, user-level ownership. > Some systems may want TaskInfo.owner to represent a group rather than an > individual user. That's fine as long as the framework sets the field to the > group ID in such a way that a group-aware authorizer can interpret it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4772) a
[ https://issues.apache.org/jira/browse/MESOS-4772?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Adam B updated MESOS-4772: -- Shepherd: Adam B Summary: a (was: TaskInfo/ExecutorInfo should include owner information) > a > - > > Key: MESOS-4772 > URL: https://issues.apache.org/jira/browse/MESOS-4772 > Project: Mesos > Issue Type: Improvement > Components: security >Reporter: Adam B > Labels: authorization, mesosphere, ownership, security > > We need a way to assign fine-grained ownership to tasks/executors so that > multi-user frameworks can tell Mesos to associate the task with a user > identity (rather than just the framework principal+role). Then, when an HTTP > user requests to view the task's sandbox contents, or kill the task, or list > all tasks, the authorizer can determine whether to allow/deny/filter the > request based on finer-grained, user-level ownership. > Some systems may want TaskInfo.owner to represent a group rather than an > individual user. That's fine as long as the framework sets the field to the > group ID in such a way that a group-aware authorizer can interpret it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4777) Support port mapping in unified containerizer.
Jie Yu created MESOS-4777: - Summary: Support port mapping in unified containerizer. Key: MESOS-4777 URL: https://issues.apache.org/jira/browse/MESOS-4777 Project: Mesos Issue Type: Task Reporter: Jie Yu For instance, if bridge network is used, we need to setup NAT rules so that external hosts can access the services running inside the container. This is similar to Docker's port mapping (DockerInfo.PortMapping). We need to think about how to add the API (e.g., in NetworkInfo), and how to add the implementation (e.g., in docker/runtime isolator or rely on CNI plugin to do that). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-2971) Implement OverlayFS based provisioner backend
[ https://issues.apache.org/jira/browse/MESOS-2971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-2971: -- Story Points: 5 > Implement OverlayFS based provisioner backend > - > > Key: MESOS-2971 > URL: https://issues.apache.org/jira/browse/MESOS-2971 > Project: Mesos > Issue Type: Improvement >Reporter: Timothy Chen >Assignee: Shuai Lin > Labels: mesosphere, twitter, unified-containerizer-mvp > > Part of the image provisioning process is to call a backend to create a root > filesystem based on the image's on-disk layout. > The problem with the copy backend is that it wastes both I/O and space, > and the bind backend can only deal with one layer. > An OverlayFS backend allows us to utilize the filesystem to merge multiple > filesystems into one efficiently. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-3193) Implement AppC image discovery.
[ https://issues.apache.org/jira/browse/MESOS-3193?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jie Yu updated MESOS-3193: -- Sprint: Mesosphere Sprint 27, Mesosphere Sprint 28 (was: Mesosphere Sprint 27, Mesosphere Sprint 28, Mesosphere Sprint 29) > Implement AppC image discovery. > --- > > Key: MESOS-3193 > URL: https://issues.apache.org/jira/browse/MESOS-3193 > Project: Mesos > Issue Type: Task >Reporter: Yan Xu >Assignee: Jojy Varghese > Labels: mesosphere, twitter, unified-containerizer-mvp > > Appc spec specifies two image discovery mechanisms: simple and meta > discovery. We need to have an abstraction for image discovery in AppcStore. > For MVP, we can implement the simple discovery first. > https://reviews.apache.org/r/34139/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4047) MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky
[ https://issues.apache.org/jira/browse/MESOS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167418#comment-15167418 ] Alexander Rojas commented on MESOS-4047: So after fixing the issues raised in previous comments, I managed to reproduce the issue mentioned in the logs posted here. Apparently there is yet another race, where the executor exits before the line {{Future usage = containerizer2.get()->usage(containerId);}}. I managed to collect two verbose logs for a good and a bad run. I add only the important sections. Pay attention to lines which look like {{I0224 13:53:53.169703 25060 slave.cpp:3528] executor(1)@127.0.0.1:38732 exited}} The good run: {noformat} ... I0224 13:53:52.219846 25063 slave.cpp:1891] Asked to kill task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- Received killTask Shutting down Sending SIGTERM to process tree at pid 31659 Sent SIGTERM to the following process trees: [ -+- 31659 sh -c while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done \--- 31661 dd count=512 bs=1M if=/dev/zero of=./temp ] Command terminated with signal Terminated (pid: 31659) I0224 13:53:52.369876 25062 slave.cpp:3002] Handling status update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- from executor(1)@127.0.0.1:38732 I0224 13:53:52.386056 25059 mem.cpp:353] Updated 'memory.soft_limit_in_bytes' to 32MB for container d78a1f77-a3a1-44e4-9898-a62523a1c1e0 I0224 13:53:53.113471 25059 mem.cpp:388] Updated 'memory.limit_in_bytes' to 32MB for container d78a1f77-a3a1-44e4-9898-a62523a1c1e0 I0224 13:53:53.117938 25059 status_update_manager.cpp:320] Received status update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- I0224 13:53:53.118013 25059 status_update_manager.cpp:824] 
Checkpointing UPDATE for status update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- I0224 13:53:53.146458 25058 slave.cpp:3400] Forwarding the update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- to master@127.0.0.1:57058 I0224 13:53:53.146702 25058 slave.cpp:3310] Sending acknowledgement for status update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- to executor(1)@127.0.0.1:38732 I0224 13:53:53.147956 25062 master.cpp:4794] Status update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- from slave 92632338-e777-41c7-a9a3-39dc62fdea4c-S0 at slave(278)@127.0.0.1:57058 (localhost) I0224 13:53:53.147989 25062 master.cpp:4842] Forwarding status update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- I0224 13:53:53.148143 25062 master.cpp:6450] Updating the state of task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- (latest state: TASK_KILLED, status update state: TASK_KILLED) I0224 13:53:53.149320 25061 master.cpp:3952] Processing ACKNOWLEDGE call 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8 for task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- (default) at scheduler-79245611-a7d2-4220-bae1-4702a34ecf14@127.0.0.1:57058 on slave 92632338-e777-41c7-a9a3-39dc62fdea4c-S0 I0224 13:53:53.149684 25061 master.cpp:6516] Removing task 21236fe6-f5b3-4647-b4b0-fd83827436a3 with resources cpus(*):1; mem(*):256; disk(*):1024 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- on 
slave 92632338-e777-41c7-a9a3-39dc62fdea4c-S0 at slave(278)@127.0.0.1:57058 (localhost) I0224 13:53:53.150146 25061 status_update_manager.cpp:392] Received status update acknowledgement (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- I0224 13:53:53.150410 25061 status_update_manager.cpp:824] Checkpointing ACK for status update TASK_KILLED (UUID: 4f1a8c80-c4de-4e27-8fd1-79ecb89dcbd8) for task 21236fe6-f5b3-4647-b4b0-fd83827436a3 of framework 92632338-e777-41c7-a9a3-39dc62fdea4c- I0224 13:53:53.153118 25056 sched.cpp:1903] Asked to stop the driver I0224 13:53:53.153228 25064 sched.cpp:1143] Stopping framework '92632338-e777-41c7-a9a3-39dc62fdea4c-' I0224 13:53:53.154057 25061 master.cpp:5926] Processing TEARDOWN call for framework
[jira] [Created] (MESOS-4776) It should be possible to disable rate limiting of the metrics endpoint for tests
Benjamin Bannier created MESOS-4776: --- Summary: It should be possible to disable rate limiting of the metrics endpoint for tests Key: MESOS-4776 URL: https://issues.apache.org/jira/browse/MESOS-4776 Project: Mesos Issue Type: Improvement Reporter: Benjamin Bannier Assignee: Benjamin Bannier -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4758) Add a 'name' field into NetworkInfo.
[ https://issues.apache.org/jira/browse/MESOS-4758?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167304#comment-15167304 ] Qian Zhang commented on MESOS-4758: --- RR: https://reviews.apache.org/r/44004/ > Add a 'name' field into NetworkInfo. > > > Key: MESOS-4758 > URL: https://issues.apache.org/jira/browse/MESOS-4758 > Project: Mesos > Issue Type: Task >Reporter: Jie Yu >Assignee: Qian Zhang > > This allows the framework writer to specify the name of the network they want > their container to join. > Why not use 'groups'? That's because there might be multiple groups under a > single network (e.g., admin vs. user, public vs. private, etc.). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (MESOS-4775) Add allocator metric for number of active inverse offer filters
Benjamin Bannier created MESOS-4775: --- Summary: Add allocator metric for number of active inverse offer filters Key: MESOS-4775 URL: https://issues.apache.org/jira/browse/MESOS-4775 Project: Mesos Issue Type: Improvement Components: allocation Reporter: Benjamin Bannier Assignee: Benjamin Bannier -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4722) Add allocator metric for number of active offer filters
[ https://issues.apache.org/jira/browse/MESOS-4722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Benjamin Bannier updated MESOS-4722: Summary: Add allocator metric for number of active offer filters (was: Add allocator metric for number of active filters) > Add allocator metric for number of active offer filters > --- > > Key: MESOS-4722 > URL: https://issues.apache.org/jira/browse/MESOS-4722 > Project: Mesos > Issue Type: Improvement > Components: allocation >Reporter: Benjamin Bannier >Assignee: Benjamin Bannier > Labels: mesosphere > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (MESOS-4047) MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky
[ https://issues.apache.org/jira/browse/MESOS-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Bernd Mathiske updated MESOS-4047: -- Fix Version/s: (was: 0.27.0) > MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery is flaky > --- > > Key: MESOS-4047 > URL: https://issues.apache.org/jira/browse/MESOS-4047 > Project: Mesos > Issue Type: Bug > Components: test >Affects Versions: 0.26.0 > Environment: Ubuntu 14, gcc 4.8.4 >Reporter: Joseph Wu >Assignee: Alexander Rojas > Labels: flaky, flaky-test > Fix For: 0.28.0 > > > {code:title=Output from passed test} > [--] 1 test from MemoryPressureMesosTest > 1+0 records in > 1+0 records out > 1048576 bytes (1.0 MB) copied, 0.000430889 s, 2.4 GB/s > [ RUN ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery > I1202 11:09:14.319327 5062 exec.cpp:134] Version: 0.27.0 > I1202 11:09:14.17 5079 exec.cpp:208] Executor registered on slave > bea15b35-9aa1-4b57-96fb-29b5f70638ac-S0 > Registered executor on ubuntu > Starting task 4e62294c-cfcf-4a13-b699-c6a4b7ac5162 > sh -c 'while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done' > Forked command at 5085 > I1202 11:09:14.391739 5077 exec.cpp:254] Received reconnect request from > slave bea15b35-9aa1-4b57-96fb-29b5f70638ac-S0 > I1202 11:09:14.398598 5082 exec.cpp:231] Executor re-registered on slave > bea15b35-9aa1-4b57-96fb-29b5f70638ac-S0 > Re-registered executor on ubuntu > Shutting down > Sending SIGTERM to process tree at pid 5085 > Killing the following process trees: > [ > -+- 5085 sh -c while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done > \--- 5086 dd count=512 bs=1M if=/dev/zero of=./temp > ] > [ OK ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery (1096 ms) > {code} > {code:title=Output from failed test} > [--] 1 test from MemoryPressureMesosTest > 1+0 records in > 1+0 records out > 1048576 bytes (1.0 MB) copied, 0.000404489 s, 2.6 GB/s > [ RUN ] MemoryPressureMesosTest.CGROUPS_ROOT_SlaveRecovery > I1202 11:09:15.509950 5109 
exec.cpp:134] Version: 0.27.0 > I1202 11:09:15.568183 5123 exec.cpp:208] Executor registered on slave > 88734acc-718e-45b0-95b9-d8f07cea8a9e-S0 > Registered executor on ubuntu > Starting task 14b6bab9-9f60-4130-bdc4-44efba262bc6 > Forked command at 5132 > sh -c 'while true; do dd count=512 bs=1M if=/dev/zero of=./temp; done' > I1202 11:09:15.665498 5129 exec.cpp:254] Received reconnect request from > slave 88734acc-718e-45b0-95b9-d8f07cea8a9e-S0 > I1202 11:09:15.670995 5123 exec.cpp:381] Executor asked to shutdown > Shutting down > Sending SIGTERM to process tree at pid 5132 > ../../src/tests/containerizer/memory_pressure_tests.cpp:283: Failure > (usage).failure(): Unknown container: ebe90e15-72fa-4519-837b-62f43052c913 > *** Aborted at 1449083355 (unix time) try "date -d @1449083355" if you are > using GNU date *** > {code} > Notice that in the failed test, the executor is asked to shutdown when it > tries to reconnect to the agent. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (MESOS-4774) Wrong symbolic link of some Mesos libraries
[ https://issues.apache.org/jira/browse/MESOS-4774?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15167054#comment-15167054 ] Chen Zhiwei commented on MESOS-4774: https://reviews.apache.org/r/43999/ > Wrong symbolic link of some Mesos libraries > --- > > Key: MESOS-4774 > URL: https://issues.apache.org/jira/browse/MESOS-4774 > Project: Mesos > Issue Type: Bug >Reporter: Chen Zhiwei >Assignee: Chen Zhiwei > > When Mesos is installed with `make install`, symbolic links are created from > $(DESTDIR)/$(pkgmoduledir)/$$lib to $$lib. > As we can see, this uses an absolute path, but when building an RPM/Debian package > $(DESTDIR) is $BUILD_ROOT_DIR. Installing the RPM/Debian package on a target > host breaks the symbolic links, since there is no $BUILD_ROOT_DIR on the > target host. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
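A small sketch of one way to avoid the broken links: compute the link target as it will appear on the target host by removing the staging prefix before the link is created. `stripDestdir` is a hypothetical helper and the paths in the usage note are illustrative; the approach actually taken in the review request is not described in this message.

```cpp
#include <string>

// Hypothetical helper: removes the staging prefix `destdir` from an
// absolute `path`, so that a symlink target stays valid once the package
// is installed on a host where the staging directory does not exist.
// Paths that do not lie under `destdir` are returned unchanged.
std::string stripDestdir(const std::string& destdir, const std::string& path)
{
  // Ignore trailing slashes on the prefix ("/build/root/" == "/build/root").
  std::string prefix = destdir;
  while (!prefix.empty() && prefix.back() == '/') {
    prefix.pop_back();
  }

  if (prefix.empty() || path.compare(0, prefix.size(), prefix) != 0) {
    return path;
  }

  // The path is the staging directory itself.
  if (path.size() == prefix.size()) {
    return "/";
  }

  // Only strip at a path component boundary, so that "/build/rootfs"
  // is not treated as being under "/build/root".
  if (path[prefix.size()] != '/') {
    return path;
  }

  return path.substr(prefix.size());
}
```

For example, with a staging directory of `/build/root`, the staged path `/build/root/usr/lib/mesos/modules/libfoo.so` would become `/usr/lib/mesos/modules/libfoo.so`. Creating a relative link instead would be another option.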