[jira] [Created] (MESOS-4958) Implement clang-tidy check for log message style

2016-03-19 Thread Benjamin Bannier (JIRA)
Benjamin Bannier created MESOS-4958:
---

 Summary: Implement clang-tidy check for log message style
 Key: MESOS-4958
 URL: https://issues.apache.org/jira/browse/MESOS-4958
 Project: Mesos
  Issue Type: Improvement
Reporter: Benjamin Bannier


In most cases Mesos log messages should not be explicitly terminated with a 
period.

We should add a check that message strings passed to, e.g., `LOG`, `std::cout`, 
`std::cerr`, or `CHECK*` do not end in a period.
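
As a rough illustration of the rule described above (not the proposed clang-tidy check itself, which would inspect the AST), the following sketch scans source text with regular expressions. The function name `find_trailing_periods` and both patterns are hypothetical, and the matcher assumes string literals do not contain `;`.

```python
import re

# Hypothetical, simplified matcher: a logging sink followed by everything up to
# the next ';'. A real clang-tidy check would walk the AST instead of using regexes.
STATEMENT = re.compile(
    r'\b(?:(?:LOG|VLOG|CHECK\w*)\([^;]*?\)|std::cout|std::cerr)[^;]*;')

# The last streamed operand, if it is a string literal directly before ';'.
FINAL_LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"\s*;$')


def find_trailing_periods(source):
    """Return final string literals of log statements that end in a period."""
    offenders = []
    for statement in STATEMENT.finditer(source):
        match = FINAL_LITERAL.search(statement.group(0))
        if match and match.group(1).endswith('.'):
            offenders.append(match.group(1))
    return offenders
```

Only the last streamed operand is inspected, so a period in the middle of a message (followed by more streamed values) is not flagged.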



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Robert Brockbank (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198150#comment-15198150
 ] 

Robert Brockbank commented on MESOS-4823:
-

While the CNI interface doesn't natively support port mapping, can Mesos not 
pass this information in to the CNI plugin using the `extra arguments` 
parameter?  That way the CNI plugin can be responsible for handling this 
without any risk of clashing iptables entries, etc.

> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are 
> listening on, to the outside world. When containers are running on bridged (or ptp) 
> networking this can be achieved by installing port forwarding rules on the 
> agent (using iptables). This can be done in the `network/cni` isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.
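
To make the two inputs named in the description concrete (an exposed port and the container's IP address), here is a minimal sketch of how a port forwarding rule could be assembled as an `iptables` command line. The helper `dnat_rule`, the choice of the `PREROUTING` chain, and the example addresses are assumptions for illustration, not the isolator's actual implementation.

```python
def dnat_rule(host_port, container_ip, container_port, protocol="tcp"):
    """Build the argv for an iptables DNAT rule forwarding host_port to the container."""
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-p", protocol, "--dport", str(host_port),
        "-j", "DNAT", "--to-destination", f"{container_ip}:{container_port}",
    ]


# Example values are made up; the list is only built and printed, never executed.
print(" ".join(dnat_rule(8080, "10.0.0.5", 80)))
```

Returning an argument list rather than a shell string keeps the sketch safe to pass to a process launcher without quoting concerns.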





[jira] [Created] (MESOS-4981) Framework (re-)register metric counters broken for calls made via scheduler driver

2016-03-19 Thread Anand Mazumdar (JIRA)
Anand Mazumdar created MESOS-4981:
-

 Summary: Framework (re-)register metric counters broken for calls 
made via scheduler driver
 Key: MESOS-4981
 URL: https://issues.apache.org/jira/browse/MESOS-4981
 Project: Mesos
  Issue Type: Bug
  Components: master
Reporter: Anand Mazumdar


The counters {{master/messages_register_framework}} and 
{{master/messages_reregister_framework}} are no longer being incremented after 
the scheduler driver started sending {{Call}} messages to the master in Mesos 
0.23. We should either add new counter(s) for {{Subscribe}} calls to the 
master for both PID/HTTP frameworks, or modify the existing code to correctly 
increment the counters.





[jira] [Commented] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Alex Pollitt (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200311#comment-15200311
 ] 

Alex Pollitt commented on MESOS-4823:
-

Avinash: I think you are conflating issues here.  CNI is 100% agnostic to layer 
4 (and above).  If your container is connected to a CNI network then it will 
have a uniquely identifiable IP address within that network, and every service 
it exposes is available on that IP address.  There is nothing going on at layer 
4 that makes the service not addressable from the outside world.  For a CNI 
overlay network the thing that makes the service not addressable from the 
outside world is the layer 3 address (nothing to do with layer 4).  So I think 
that Dan's comment above is spot on.

There are a variety of ways you can get traffic in/out of an overlay network.  
iptables port mapping is just one way, and as Dan says, is dependent on the CNI 
network implementation.

For full disclosure, I work on Project Calico, which can operate in overlay 
mode or non-overlay mode as a CNI plugin. The iptables approach to port 
mapping, if implemented in such a way that it doesn't clash with Calico's own 
use of iptables, should work for getting traffic in/out of a Calico overlay 
network.  But it will not work for a bunch of other CNI network 
implementations.  

This is a thorny problem to solve generically.  I've seen people do it with 
iptables port mapping, with SDN specific solutions, with HA Proxy, and with 
things like kubeproxy (in Kubernetes land).  But I haven't seen a one size fits 
all solution yet because there is such a broad range of CNI network 
implementations.

(By the way, I am just down the road from Mesosphere HQ, so if it would be 
helpful to get in front of a whiteboard to help with any of this CNI stuff then 
just let me know.)



> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are 
> listening on, to the outside world. When containers are running on bridged 
> (or ptp) networking this can be achieved by installing port forwarding rules 
> on the agent (using iptables). This can be done in the `network/cni` 
> isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.





[jira] [Updated] (MESOS-4941) Support update existing quota

2016-03-19 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-4941:
---
Labels: Quota mesosphere  (was: Quota)

> Support update existing quota
> -
>
> Key: MESOS-4941
> URL: https://issues.apache.org/jira/browse/MESOS-4941
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Zhitao Li
>Assignee: Zhitao Li
>  Labels: Quota, mesosphere
>
> We want to support updating an existing quota without the cycle of delete and 
> recreate. This avoids the possible starvation risk of losing the quota 
> between delete and recreate, and also makes the interface friendly.
> Design doc:
> https://docs.google.com/document/d/1c8fJY9_N0W04FtUQ_b_kZM6S0eePU7eYVyfUP14dSys





[jira] [Updated] (MESOS-4633) Tests will dereference stack allocated agent objects upon assertion/expectation failure.

2016-03-19 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-4633:

Shepherd: Michael Park  (was: Bernd Mathiske)

> Tests will dereference stack allocated agent objects upon 
> assertion/expectation failure.
> 
>
> Key: MESOS-4633
> URL: https://issues.apache.org/jira/browse/MESOS-4633
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: flaky, mesosphere, tech-debt, test
> Fix For: 0.29.0
>
>
> Tests that use the {{StartSlave}} test helper are generally fragile when the 
> test fails an assert/expect in the middle of the test.  This is because the 
> {{StartSlave}} helper takes raw pointer arguments, which may be 
> stack-allocated.
> In case of an assert failure, the test immediately exits (destroying stack 
> allocated objects) and proceeds onto test cleanup.  The test cleanup may 
> dereference some of these destroyed objects, leading to a test crash like:
> {code}
> [18:27:36][Step 8/8] F0204 18:27:35.981302 23085 logging.cpp:64] RAW: Pure 
> virtual method called
> [18:27:36][Step 8/8] @ 0x7f7077055e1c  google::LogMessage::Fail()
> [18:27:36][Step 8/8] @ 0x7f707705ba6f  google::RawLog__()
> [18:27:36][Step 8/8] @ 0x7f70760f76c9  __cxa_pure_virtual
> [18:27:36][Step 8/8] @   0xa9423c  
> mesos::internal::tests::Cluster::Slaves::shutdown()
> [18:27:36][Step 8/8] @  0x1074e45  
> mesos::internal::tests::MesosTest::ShutdownSlaves()
> [18:27:36][Step 8/8] @  0x1074de4  
> mesos::internal::tests::MesosTest::Shutdown()
> [18:27:36][Step 8/8] @  0x1070ec7  
> mesos::internal::tests::MesosTest::TearDown()
> {code}
> The {{StartSlave}} helper should take {{shared_ptr}} arguments instead.
> This also means that we can remove the {{Shutdown}} helper from most of these 
> tests.





[jira] [Commented] (MESOS-4621) --disable-optimize triggers optimized builds.

2016-03-19 Thread James Peach (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4621?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197509#comment-15197509
 ] 

James Peach commented on MESOS-4621:


This is one instance of the problem reported in MESOS-2537. Basically, the 
conventional autotools enable/disable semantics don't work at all.

> --disable-optimize triggers optimized builds.
> -
>
> Key: MESOS-4621
> URL: https://issues.apache.org/jira/browse/MESOS-4621
> Project: Mesos
>  Issue Type: Bug
>Reporter: Till Toenshoff
>Assignee: Yong Tang
>Priority: Minor
>
> The toggle-logic of the build configuration argument {{optimize}} appears to 
> be implemented incorrectly. When using the perfectly legal invocation:
> {noformat}
> ../configure --disable-optimize
> {noformat}
> What you get here is an optimized build ({{-O2}}):
> {noformat}
> ccache g++ -Qunused-arguments -fcolor-diagnostics 
> -DPACKAGE_NAME=\"libprocess\" -DPACKAGE_TARNAME=\"libprocess\" 
> -DPACKAGE_VERSION=\"0.0.1\" -DPACKAGE_STRING=\"libprocess\ 0.0.1\" 
> -DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"libprocess\" 
> -DVERSION=\"0.0.1\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 
> -DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
> -DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 
> -DLT_OBJDIR=\".libs/\" -DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 
> -DHAVE_SVN_VERSION_H=1 -DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 
> -DHAVE_LIBSVN_DELTA_1=1 -DHAVE_LIBCURL=1 -DHAVE_PTHREAD_PRIO_INHERIT=1 
> -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBDL=1 -I. 
> -I../../../../3rdparty/libprocess/3rdparty  
> -I../../../../3rdparty/libprocess/3rdparty/stout/include -Iprotobuf-2.5.0/src 
>  -Igmock-1.7.0/gtest/include -Igmock-1.7.0/include -isystem boost-1.53.0 
> -Ipicojson-1.3.0 -DPICOJSON_USE_INT64 -D__STDC_FORMAT_MACROS -Iglog-0.3.3/src 
> -I/usr/local/opt/openssl/include -I/usr/local/opt/libevent/include 
> -I/usr/local/opt/subversion/include/subversion-1 -I/usr/include/apr-1 
> -I/usr/include/apr-1.0   -O2 -Wno-unused-local-typedef -std=c++11 
> -stdlib=libc++ -DGTEST_USE_OWN_TR1_TUPLE=1 -DGTEST_LANG_CXX11 -MT 
> stout_tests-flags_tests.o -MD -MP -MF .deps/stout_tests-flags_tests.Tpo -c -o 
> stout_tests-flags_tests.o `test -f 'stout/tests/flags_tests.cpp' || echo 
> '../../../../3rdparty/libprocess/3rdparty/'`stout/tests/flags_tests.cpp
> {noformat}
> It seems more straightforward to actually disable optimizing for the above 
> argument.
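
For context, Autoconf treats `--disable-FEATURE` as equivalent to `--enable-FEATURE=no`, so a script that only tests whether the option was given (and not its value) will enable the feature in both cases. Whether that is the exact cause here is an assumption; the sketch below only simulates that class of bug. All function names are hypothetical, and the default of "not optimized" when no flag is given is also assumed.

```python
def parse_enable_flag(arg):
    """Map '--enable-x[=val]' or '--disable-x' to (feature, enableval), Autoconf-style."""
    if arg.startswith("--disable-"):
        return arg[len("--disable-"):], "no"
    if arg.startswith("--enable-"):
        feature, _, value = arg[len("--enable-"):].partition("=")
        return feature, value or "yes"
    raise ValueError(f"not an enable/disable flag: {arg}")


def buggy_optimize(args):
    # Bug pattern: "the option was mentioned" is treated as "enabled".
    given = dict(parse_enable_flag(a) for a in args)
    return "optimize" in given


def fixed_optimize(args):
    # Fix pattern: look at the value, so '--disable-optimize' yields False.
    given = dict(parse_enable_flag(a) for a in args)
    return given.get("optimize", "no") == "yes"
```

With `["--disable-optimize"]` the buggy variant reports optimization as enabled, while the fixed variant reports it as disabled.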





[jira] [Updated] (MESOS-4782) Extend example persistent volume test framework for multiple disks

2016-03-19 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-4782:
---
Summary: Extend example persistent volume test framework for multiple disks 
 (was: Extend persistent volume test framework for multiple disks)

> Extend example persistent volume test framework for multiple disks
> --
>
> Key: MESOS-4782
> URL: https://issues.apache.org/jira/browse/MESOS-4782
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Neil Conway
>  Labels: mesosphere, persistent-volumes, test-framework
>






[jira] [Updated] (MESOS-4810) ProvisionerDockerPullerTest.ROOT_INTERNET_CURL_ShellCommand fails.

2016-03-19 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-4810:

Summary: ProvisionerDockerPullerTest.ROOT_INTERNET_CURL_ShellCommand fails. 
 (was: ProvisionerDockerRegistryPullerTest.ROOT_INTERNET_CURL_ShellCommand 
fails.)

> ProvisionerDockerPullerTest.ROOT_INTERNET_CURL_ShellCommand fails.
> --
>
> Key: MESOS-4810
> URL: https://issues.apache.org/jira/browse/MESOS-4810
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.28.0
> Environment: CentOS 7 on AWS, both with or without SSL.
>Reporter: Bernd Mathiske
>Assignee: Jie Yu
>  Labels: docker, mesosphere, test
>
> {noformat}
> [09:46:46] :   [Step 11/11] [ RUN  ] 
> ProvisionerDockerRegistryPullerTest.ROOT_INTERNET_CURL_ShellCommand
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.628413  1166 leveldb.cpp:174] 
> Opened db in 4.242882ms
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.629926  1166 leveldb.cpp:181] 
> Compacted db in 1.483621ms
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.629966  1166 leveldb.cpp:196] 
> Created db iterator in 15498ns
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.629977  1166 leveldb.cpp:202] 
> Seeked to beginning of db in 1405ns
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.629984  1166 leveldb.cpp:271] 
> Iterated through 0 keys in the db in 239ns
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.630015  1166 replica.cpp:779] 
> Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.630470  1183 recover.cpp:447] 
> Starting replica recovery
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.630702  1180 recover.cpp:473] 
> Replica is in EMPTY status
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.631767  1182 replica.cpp:673] 
> Replica in EMPTY status received a broadcasted recover request from 
> (14567)@172.30.2.124:37431
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.632115  1183 recover.cpp:193] 
> Received a recover response from a replica in EMPTY status
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.632450  1186 recover.cpp:564] 
> Updating replica status to STARTING
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633476  1186 master.cpp:375] 
> Master 3fbb2fb0-4f18-498b-a440-9acbf6923a13 (ip-172-30-2-124.mesosphere.io) 
> started on 172.30.2.124:37431
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633491  1186 master.cpp:377] Flags 
> at startup: --acls="" --allocation_interval="1secs" 
> --allocator="HierarchicalDRF" --authenticate="true" 
> --authenticate_http="true" --authenticate_slaves="true" 
> --authenticators="crammd5" --authorizers="local" 
> --credentials="/tmp/4UxXoW/credentials" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --http_authenticators="basic" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
> --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="100secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/4UxXoW/master" 
> --zk_session_timeout="10secs"
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633677  1186 master.cpp:422] 
> Master only allowing authenticated frameworks to register
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633685  1186 master.cpp:427] 
> Master only allowing authenticated slaves to register
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633692  1186 credentials.hpp:35] 
> Loading credentials for authentication from '/tmp/4UxXoW/credentials'
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633851  1183 leveldb.cpp:304] 
> Persisting metadata (8 bytes) to leveldb took 1.191043ms
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633873  1183 replica.cpp:320] 
> Persisted replica status to STARTING
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633894  1186 master.cpp:467] Using 
> default 'crammd5' authenticator
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634003  1186 master.cpp:536] Using 
> default 'basic' HTTP authenticator
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634062  1184 recover.cpp:473] 
> Replica is in STARTING status
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634109  1186 master.cpp:570] 
> Authorization enabled
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634249  1187 
> whitelist_watcher.cpp:77] No whitelist given
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634255  1184 hierarchical.cpp:144] 
> Initialized hierarchical allocator process
> [09:46:46]W:   [St

[jira] [Commented] (MESOS-3902) The Location header when non-leading master redirects to leading master is incomplete.

2016-03-19 Thread Ashwin Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199982#comment-15199982
 ] 

Ashwin Murthy commented on MESOS-3902:
--

Should we extend the Flags support when creating masters in the test framework 
to support starting either as leader or non-leader? The usage in the test 
would look like:

1. leadingMaster = StartMaster(flags_leader)
2. nonLeadingMaster1 = StartMaster(flags_nonleader)
3. nonLeadingMaster2 = StartMaster(flags_nonleader)
4. // perform test on leader and the two non-leaders

> The Location header when non-leading master redirects to leading master is 
> incomplete.
> --
>
> Key: MESOS-3902
> URL: https://issues.apache.org/jira/browse/MESOS-3902
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, master
>Affects Versions: 0.25.0
> Environment: 3 masters, 10 slaves
>Reporter: Ben Whitehead
>Assignee: Ashwin Murthy
>  Labels: mesosphere
>
> The master now sets a location header, but it's incomplete. The path of the 
> URL isn't set. Consider an example:
> {code}
> > cat /tmp/subscribe-1072944352375841456 | httpp POST 
> > 127.1.0.3:5050/api/v1/scheduler Content-Type:application/x-protobuf
> POST /api/v1/scheduler HTTP/1.1
> Accept: application/json
> Accept-Encoding: gzip, deflate
> Connection: keep-alive
> Content-Length: 123
> Content-Type: application/x-protobuf
> Host: 127.1.0.3:5050
> User-Agent: HTTPie/0.9.0
> +-----------------------------------------+
> | NOTE: binary data not shown in terminal |
> +-----------------------------------------+
> HTTP/1.1 307 Temporary Redirect
> Content-Length: 0
> Date: Fri, 26 Feb 2016 00:54:41 GMT
> Location: //127.1.0.1:5050
> {code}
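
The example above shows a `Location` of `//127.1.0.1:5050` for a request to `/api/v1/scheduler`. A minimal sketch of what a complete scheme-relative redirect target could look like, keeping the original path and query, is below. The helper name `redirect_location` is hypothetical and this is not the master's actual code.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_location(leader_authority, request_target):
    """Build a scheme-relative Location that keeps the request's path and query."""
    parts = urlsplit(request_target)
    # Empty scheme and fragment: only authority, path and query are emitted.
    return urlunsplit(("", leader_authority, parts.path, parts.query, ""))


print(redirect_location("127.1.0.1:5050", "/api/v1/scheduler"))
```

A client following such a redirect would repeat the request against the same endpoint on the leading master instead of its root.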





[jira] [Commented] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200365#comment-15200365
 ] 

Avinash Sridharan commented on MESOS-4823:
--

[~lxpollitt] I completely agree with your observation that port mapping is just 
one of the ways to expose the service to the external world. If the service 
(layer 4 + layer 3) is completely addressable by the underlying CNI network and 
port mapping is not required that is perfectly fine. The only point I am trying 
to make is there are cases where port translation "might" be required by the 
container to make its service accessible, but unfortunately there is no way in 
CNI to communicate this information to the underlying plugin and hence we were 
thinking of implementing this piece in the isolator itself. It is an opt-in 
where the frameworks would specify whether they want port mapping or not. 

The idea here is that we should not be breaking the CNI spec, but at the same 
time we feel that the spec itself is evolving and we should try to compensate for 
the missing pieces. 


Would love to schedule a hangout if you would like to discuss further on this 
and get some closure as to whether this is an acceptable solution to enable 
port mapping in CNI or maybe come up with an alternate solution that does not 
touch the CNI isolator.  

> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are 
> listening on, to the outside world. When containers are running on bridged 
> (or ptp) networking this can be achieved by installing port forwarding rules 
> on the agent (using iptables). This can be done in the `network/cni` 
> isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.





[jira] [Updated] (MESOS-4941) Support update existing quota

2016-03-19 Thread Zhitao Li (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhitao Li updated MESOS-4941:
-
Description: 
We want to support updating an existing quota without the cycle of delete and 
recreate. This avoids the possible starvation risk of losing the quota between 
delete and recreate, and also makes the interface friendly.

Design doc:
https://docs.google.com/document/d/1c8fJY9_N0W04FtUQ_b_kZM6S0eePU7eYVyfUP14dSys

  was:We want to support updating an existing quota without the cycle of delete 
and recreate. This avoids the possible starvation risk of losing the quota 
between delete and recreate, and also makes the interface friendly.


Design doc:
https://docs.google.com/document/d/1c8fJY9_N0W04FtUQ_b_kZM6S0eePU7eYVyfUP14dSys

> Support update existing quota
> -
>
> Key: MESOS-4941
> URL: https://issues.apache.org/jira/browse/MESOS-4941
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Zhitao Li
>Assignee: Zhitao Li
>  Labels: Quota
>
> We want to support updating an existing quota without the cycle of delete and 
> recreate. This avoids the possible starvation risk of losing the quota 
> between delete and recreate, and also makes the interface friendly.
> Design doc:
> https://docs.google.com/document/d/1c8fJY9_N0W04FtUQ_b_kZM6S0eePU7eYVyfUP14dSys





[jira] [Created] (MESOS-4962) Support for Mesos releases

2016-03-19 Thread Vinod Kone (JIRA)
Vinod Kone created MESOS-4962:
-

 Summary: Support for Mesos releases
 Key: MESOS-4962
 URL: https://issues.apache.org/jira/browse/MESOS-4962
 Project: Mesos
  Issue Type: Task
Reporter: Vinod Kone
Assignee: Vinod Kone


As part of Mesos reaching 1.0, we need to formalize the policy of supporting 
Mesos releases.

Some specific questions we need to answer:

--> What fixes should we backport to older releases?

--> How many old releases are supported?

--> Should we have a LTS version?

--> What is the cadence of major, minor and patch releases?





[jira] [Updated] (MESOS-4612) Update vendored ZooKeeper to 3.4.8

2016-03-19 Thread Chen Zhiwei (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chen Zhiwei updated MESOS-4612:
---
Description: 
See: http://zookeeper.apache.org/doc/r3.4.8/releasenotes.html for improvements 
/ bug fixes

Added a new patch that solved 
[ZOOKEEPER-1643](https://issues.apache.org/jira/browse/ZOOKEEPER-1643)

The original patch: 


  was:See: http://zookeeper.apache.org/doc/r3.4.8/releasenotes.html for 
improvements / bug fixes


> Update vendored ZooKeeper to 3.4.8
> --
>
> Key: MESOS-4612
> URL: https://issues.apache.org/jira/browse/MESOS-4612
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Cody Maloney
>Assignee: Chen Zhiwei
>  Labels: mesosphere, tech-debt, zookeeper
>
> See: http://zookeeper.apache.org/doc/r3.4.8/releasenotes.html for 
> improvements / bug fixes
> Added a new patch that solved 
> [ZOOKEEPER-1643](https://issues.apache.org/jira/browse/ZOOKEEPER-1643)
> The original patch: 
> 





[jira] [Updated] (MESOS-4964) curl fetcher fails to decode chunked encoding

2016-03-19 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4964:
--
Labels: mesosphere  (was: )

> curl fetcher fails to decode chunked encoding
> -
>
> Key: MESOS-4964
> URL: https://issues.apache.org/jira/browse/MESOS-4964
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Reporter: James Peach
>  Labels: mesosphere
>
> If the curl-based fetcher gets an HTTP response that is chunked, the HTTP 
> decode fails because the response says it is chunked, but curl is dechunking 
> the body to stdout.
> {code}
> E0316 15:23:31.124482 13299 slave.cpp:3773] Container 
> 'fa06a5ee-637e-480c-b602-59705b707d85' for executor 'jpeach.10489' of 
> framework 96d1191b-cdf0-40f6-8840-e4d4d92a9345-0010 failed to start: Collect 
> failed: Failed to decode HTTP responses: Decoding failed
> HTTP/1.1 400 Bad Request
> Server: nginx/1.9.4
> Date: Wed, 16 Mar 2016 22:23:30 GMT
> Content-Type: application/json
> Transfer-Encoding: chunked
> Connection: keep-alive
> X-Artifactory-Id: ae6c9bffd47ec19a:-61ef0a68:1537a605a05:-8000
> {
>   "errors" : [ {
> "status" : 400,
> "message" : "Unsupported docker v2 repository request for 
> 'docker-registry'"
>   } ]
> }
> {code}





[jira] [Updated] (MESOS-4839) Move placement new processes into the freezer cgroup into a parent hook.

2016-03-19 Thread Joerg Schad (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joerg Schad updated MESOS-4839:
---
Sprint: Mesosphere Sprint 31

> Move placement new processes into the freezer cgroup into a parent hook.
> 
>
> Key: MESOS-4839
> URL: https://issues.apache.org/jira/browse/MESOS-4839
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>
> The Linux Launcher places new processes into the freezer cgroup.
> This is currently done by a combination of childSetup function (blocking the 
> new process until parent is done) and the parent (placing child process into 
> the cgroup and then signaling child to continue).
> ParentHooks support this behavior (blocking child until some work is done in 
> the parent) in a much cleaner way. 





[jira] [Created] (MESOS-4970) Add more examples of JSON resources to docs

2016-03-19 Thread Greg Mann (JIRA)
Greg Mann created MESOS-4970:


 Summary: Add more examples of JSON resources to docs
 Key: MESOS-4970
 URL: https://issues.apache.org/jira/browse/MESOS-4970
 Project: Mesos
  Issue Type: Documentation
  Components: documentation
Reporter: Greg Mann


The configuration documentation currently only shows examples of scalar 
resource types in JSON format. The structures of JSON resources are a bit 
complicated, so it would be very helpful to include examples of ranges, sets, 
and text resource types as well.
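
As a starting point for such documentation, the sketch below builds scalar, ranges, and set resources and prints them as JSON. The field names follow my understanding of the Mesos `Resource` and `Value` protobuf messages and should be verified against the current documentation; the resource names and values are made up. A text example is omitted because I am not certain text values are accepted for resources.

```python
import json

# Illustrative values only; field names are assumed to mirror the protobuf layout.
resources = [
    {"name": "cpus", "type": "SCALAR", "scalar": {"value": 24}},
    {"name": "ports", "type": "RANGES",
     "ranges": {"range": [{"begin": 21000, "end": 24000},
                          {"begin": 30000, "end": 34000}]}},
    {"name": "disks", "type": "SET", "set": {"item": ["sda", "sdb"]}},
]

print(json.dumps(resources, indent=2))
```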





[jira] [Commented] (MESOS-3902) The Location header when non-leading master redirects to leading master is incomplete.

2016-03-19 Thread Joseph Wu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199966#comment-15199966
 ] 

Joseph Wu commented on MESOS-3902:
--

Multi-master tests are still not possible :(
See: https://issues.apache.org/jira/browse/MESOS-2976

> The Location header when non-leading master redirects to leading master is 
> incomplete.
> --
>
> Key: MESOS-3902
> URL: https://issues.apache.org/jira/browse/MESOS-3902
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, master
>Affects Versions: 0.25.0
> Environment: 3 masters, 10 slaves
>Reporter: Ben Whitehead
>Assignee: Ashwin Murthy
>  Labels: mesosphere
>
> The master now sets a location header, but it's incomplete. The path of the 
> URL isn't set. Consider an example:
> {code}
> > cat /tmp/subscribe-1072944352375841456 | httpp POST 
> > 127.1.0.3:5050/api/v1/scheduler Content-Type:application/x-protobuf
> POST /api/v1/scheduler HTTP/1.1
> Accept: application/json
> Accept-Encoding: gzip, deflate
> Connection: keep-alive
> Content-Length: 123
> Content-Type: application/x-protobuf
> Host: 127.1.0.3:5050
> User-Agent: HTTPie/0.9.0
> +-----------------------------------------+
> | NOTE: binary data not shown in terminal |
> +-----------------------------------------+
> HTTP/1.1 307 Temporary Redirect
> Content-Length: 0
> Date: Fri, 26 Feb 2016 00:54:41 GMT
> Location: //127.1.0.1:5050
> {code}





[jira] [Created] (MESOS-4976) Reject RESERVE on revocable resources

2016-03-19 Thread Klaus Ma (JIRA)
Klaus Ma created MESOS-4976:
---

 Summary: Reject RESERVE on revocable resources
 Key: MESOS-4976
 URL: https://issues.apache.org/jira/browse/MESOS-4976
 Project: Mesos
  Issue Type: Bug
  Components: master
Reporter: Klaus Ma


In {{Resources::apply}}, we do not check whether the resources are revocable. 
It does not make sense to reserve revocable resources.
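
The missing check could look roughly like the following sketch. The `Resource` class and `validate_reserve` function are hypothetical stand-ins written for illustration and do not reflect the real `Resources` API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """Hypothetical stand-in for a Mesos resource."""
    name: str
    value: float
    revocable: bool = False


def validate_reserve(resources):
    """Return an error string if any resource is revocable, else None."""
    for resource in resources:
        if resource.revocable:
            return f"Cannot reserve revocable resource '{resource.name}'"
    return None
```

Returning an error value rather than raising keeps the sketch close to a validate-then-apply flow, where the caller rejects the operation before any state changes.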





[jira] [Updated] (MESOS-4964) curl based docker fetcher fails to decode chunked encoding

2016-03-19 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4964:
--
Shepherd: Timothy Chen

> curl based docker fetcher fails to decode chunked encoding
> --
>
> Key: MESOS-4964
> URL: https://issues.apache.org/jira/browse/MESOS-4964
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Reporter: James Peach
>Assignee: Jie Yu
>  Labels: mesosphere
>
> If the curl-based fetcher gets an HTTP response that is chunked, the HTTP 
> decode fails because the response says it is chunked, but curl is dechunking 
> the body to stdout.
> {code}
> E0316 15:23:31.124482 13299 slave.cpp:3773] Container 
> 'fa06a5ee-637e-480c-b602-59705b707d85' for executor 'jpeach.10489' of 
> framework 96d1191b-cdf0-40f6-8840-e4d4d92a9345-0010 failed to start: Collect 
> failed: Failed to decode HTTP responses: Decoding failed
> HTTP/1.1 400 Bad Request
> Server: nginx/1.9.4
> Date: Wed, 16 Mar 2016 22:23:30 GMT
> Content-Type: application/json
> Transfer-Encoding: chunked
> Connection: keep-alive
> X-Artifactory-Id: ae6c9bffd47ec19a:-61ef0a68:1537a605a05:-8000
> {
>   "errors" : [ {
> "status" : 400,
> "message" : "Unsupported docker v2 repository request for 
> 'docker-registry'"
>   } ]
> }
> {code}
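
The mismatch described above can be reproduced with a small sketch. This is not the Mesos decoder; decode_chunked and decode_body are hypothetical helpers, and the already_dechunked flag is an assumed way of telling the decoder that curl has removed the chunk framing. A strict chunked decoder rejects a body that no longer has chunk-size lines, which is one plausible reading of the "Decoding failed" error in the log.

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body; raise ValueError if it is malformed."""
    decoded = b""
    rest = body
    while True:
        size_line, sep, rest = rest.partition(b"\r\n")
        if not sep:
            raise ValueError("missing chunk-size line")
        # Chunk extensions after ';' are ignored; non-hex input raises ValueError.
        size = int(size_line.split(b";")[0], 16)
        if size == 0:
            return decoded
        if len(rest) < size + 2 or rest[size:size + 2] != b"\r\n":
            raise ValueError("truncated chunk")
        decoded += rest[:size]
        rest = rest[size + 2:]


def decode_body(headers: dict, body: bytes, already_dechunked: bool) -> bytes:
    """Only apply chunked decoding when the framing is still present."""
    chunked = headers.get("Transfer-Encoding", "").lower() == "chunked"
    if chunked and not already_dechunked:
        return decode_chunked(body)
    return body
```

Passing already_dechunked=True for output captured from curl avoids decoding the body a second time while leaving the headers untouched.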





[jira] [Comment Edited] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197927#comment-15197927
 ] 

Avinash Sridharan edited comment on MESOS-4823 at 3/16/16 6:58 PM:
---

[~djosborne] interesting point. I guess the ticket is a bit misleading. The 
fact that containers are addressable (layer 3 addressable) doesn't mean that 
their IP addresses are globally routable. The idea here was to provide NAT 
capability, along with the ability for the containers to specify the ports (if 
required) on which they want to expose their service. While the CNI spec allows 
the IP masquerade option to be specified, it doesn't specify any mechanisms to 
specify port forwarding rules. This is particularly essential to support any 
EXPOSE primitives specified by the images (as with docker's EXPOSE primitives). 

I have raised this issue in the cni-dev mailing list as well, and it seems like 
there are other folks that are interested in this requirement as well 
https://groups.google.com/forum/#!topic/cni-dev/FW3BCFJwAxY

and it does seem like there are other folks interested in port forwarding and 
firewalling rules to be part of the CNI spec. Currently however this is not the 
case and hence we will need to support it in the isolator. 





was (Author: avin...@mesosphere.io):
[~djosborne] interesting point. I guess maybe the ticket is a bit misleading. 
The fact that containers are addressable (layer 3 addressable) doesn't mean 
that their IP addresses are globally routable. The idea here was to provide 
NAT capability, along with the ability for the containers to specify the ports 
(if required) on which they want to expose their service. While the CNI spec 
allows the IP masquerade option to be specified, it doesn't specify any 
mechanisms to specify port forwarding rules. This is particularly essential to 
support any EXPOSE primitives specified by the images (as with docker's EXPOSE 
primitives). 

I have raised this issue in the cni-dev mailing list as well, and it seems like 
there are other folks that are interested in this requirement as well 
https://groups.google.com/forum/#!topic/cni-dev/FW3BCFJwAxY

and it does seem like there are other folks interested in port forwarding and 
firewalling rules to be part of the CNI spec. Currently however this is not the 
case and hence we will need to support it in the isolator. 




> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are 
> listening on, to the outside world. When containers are running on bridged (or ptp) 
> networking this can be achieved by installing port forwarding rules on the 
> agent (using iptables). This can be done in the `network/cni` isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.
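
As a rough illustration of how the two pieces of information (exposed ports and container IP) turn into port forwarding rules, the sketch below builds iptables DNAT commands as argument lists without executing them. The function name, the choice of the PREROUTING chain, and the (host_port, container_port) mapping format are assumptions for illustration, not the isolator's actual design.

```python
def port_forwarding_rules(container_ip, port_mappings, protocol="tcp"):
    """Build one iptables DNAT command per (host_port, container_port) pair."""
    rules = []
    for host_port, container_port in port_mappings:
        rules.append([
            "iptables", "-t", "nat", "-A", "PREROUTING",
            "-p", protocol, "--dport", str(host_port),
            "-j", "DNAT",
            "--to-destination", f"{container_ip}:{container_port}",
        ])
    return rules


if __name__ == "__main__":
    # Forward agent port 8080 to port 80 of a container at 10.0.0.2.
    for rule in port_forwarding_rules("10.0.0.2", [(8080, 80)]):
        print(" ".join(rule))
```

Returning argument lists rather than running the commands keeps the sketch side-effect free; a real implementation would also need to remove the rules when the container is destroyed.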





[jira] [Created] (MESOS-4978) Update mesos-execute with Appc changes.

2016-03-19 Thread Jojy Varghese (JIRA)
Jojy Varghese created MESOS-4978:


 Summary: Update mesos-execute with Appc changes.
 Key: MESOS-4978
 URL: https://issues.apache.org/jira/browse/MESOS-4978
 Project: Mesos
  Issue Type: Bug
  Components: containerization
Reporter: Jojy Varghese
Assignee: Jojy Varghese


The mesos-execute CLI application currently does not support Appc images. 
Adding support would make integration tests easier.





[jira] [Comment Edited] (MESOS-3902) The Location header when non-leading master redirects to leading master is incomplete.

2016-03-19 Thread Ashwin Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15196772#comment-15196772
 ] 

Ashwin Murthy edited comment on MESOS-3902 at 3/17/16 4:10 AM:
---

I tried to repro this in our production mesos cluster. I see the following:

ashwinm@mgmt01-sjc1:~$ curl -v -X POST -H "Content-Type: application/json" 
--data @body.json http:///api/v1/scheduler 
* About to connect() * port 5050 (#0)
*   Trying 10.162.29.25... connected
> POST /api/v1/scheduler HTTP/1.1
> User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.2d 
> zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> Host: :5050
> Accept: */*
> Content-Type: application/json
> Content-Length: 120
> 
* upload completely sent off: 120out of 120 bytes
< HTTP/1.1 307 Temporary Redirect
< Date: Wed, 16 Mar 2016 05:01:13 GMT
< Location: //:5050
< Content-Length: 0
< 
* Connection #0 to host left intact
* Closing connection #0



was (Author: ashwinmurthy):
I tried to repro this in our production mesos cluster. I see the following:

ashwinm@mgmt01-sjc1:~$ curl -v -X POST -H "Content-Type: application/json" 
--data @body.json 
http://compute34-sjc1.prod.uber.internal:5050/api/v1/scheduler 
* About to connect() to compute34-sjc1.prod.uber.internal port 5050 (#0)
*   Trying 10.162.29.25... connected
> POST /api/v1/scheduler HTTP/1.1
> User-Agent: curl/7.22.0 (x86_64-pc-linux-gnu) libcurl/7.22.0 OpenSSL/1.0.2d 
> zlib/1.2.3.4 libidn/1.23 librtmp/2.3
> Host: compute34-sjc1.prod.uber.internal:5050
> Accept: */*
> Content-Type: application/json
> Content-Length: 120
> 
* upload completely sent off: 120out of 120 bytes
< HTTP/1.1 307 Temporary Redirect
< Date: Wed, 16 Mar 2016 05:01:13 GMT
< Location: //compute35-sjc1.prod.uber.internal:5050
< Content-Length: 0
< 
* Connection #0 to host compute34-sjc1.prod.uber.internal left intact
* Closing connection #0


> The Location header when non-leading master redirects to leading master is 
> incomplete.
> --
>
> Key: MESOS-3902
> URL: https://issues.apache.org/jira/browse/MESOS-3902
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, master
>Affects Versions: 0.25.0
> Environment: 3 masters, 10 slaves
>Reporter: Ben Whitehead
>Assignee: Ashwin Murthy
>  Labels: mesosphere
>
> The master now sets a location header, but it's incomplete. The path of the 
> URL isn't set. Consider an example:
> {code}
> > cat /tmp/subscribe-1072944352375841456 | httpp POST 
> > 127.1.0.3:5050/api/v1/scheduler Content-Type:application/x-protobuf
> POST /api/v1/scheduler HTTP/1.1
> Accept: application/json
> Accept-Encoding: gzip, deflate
> Connection: keep-alive
> Content-Length: 123
> Content-Type: application/x-protobuf
> Host: 127.1.0.3:5050
> User-Agent: HTTPie/0.9.0
> +-----------------------------------------+
> | NOTE: binary data not shown in terminal |
> +-----------------------------------------+
> HTTP/1.1 307 Temporary Redirect
> Content-Length: 0
> Date: Fri, 26 Feb 2016 00:54:41 GMT
> Location: //127.1.0.1:5050
> {code}





[jira] [Created] (MESOS-4974) mesos-execute should allow setting command_uris

2016-03-19 Thread Jian Qiu (JIRA)
Jian Qiu created MESOS-4974:
---

 Summary: mesos-execute should allow setting command_uris
 Key: MESOS-4974
 URL: https://issues.apache.org/jira/browse/MESOS-4974
 Project: Mesos
  Issue Type: Bug
  Components: cli
Reporter: Jian Qiu
Priority: Minor


Based on the discussion in MESOS-4744, it would be helpful to let mesos-execute 
support setting URIs in the command info.

We can add a flag:
{code}
--uris=uri1,uri2..
{code} 
and leave the other values in CommandInfo::URIS at their defaults.
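
A small sketch of how the proposed flag value could be split, written in Python for brevity. The parse_uris helper and the dict representation are hypothetical; only the comma-separated format comes from the proposal above. Each entry sets just the URI value, so every other field would keep its default.

```python
def parse_uris(flag_value: str):
    """Split a comma-separated --uris value into one entry per URI."""
    uris = []
    for value in flag_value.split(","):
        value = value.strip()
        if value:  # Skip empty items such as a trailing comma.
            uris.append({"value": value})
    return uris
```

For example, parse_uris("uri1,uri2") yields two entries, one per URI, in the order given on the command line.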





[jira] [Assigned] (MESOS-3559) Make the Command Scheduler use the HTTP Scheduler Library

2016-03-19 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar reassigned MESOS-3559:
-

Assignee: Anand Mazumdar  (was: Guangya Liu)

> Make the Command Scheduler use the HTTP Scheduler Library
> -
>
> Key: MESOS-3559
> URL: https://issues.apache.org/jira/browse/MESOS-3559
> Project: Mesos
>  Issue Type: Task
>Reporter: Anand Mazumdar
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> We should make the Command Scheduler in {{src/cli/executor.cpp}} use the 
> Scheduler Library {{src/scheduler/scheduler.cpp}} instead of the Scheduler 
> Driver.





[jira] [Updated] (MESOS-4878) Task stuck in TASK_STAGING when docker fetcher failed to fetch the image

2016-03-19 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4878:
--
Affects Version/s: 0.28.0

> Task stuck in TASK_STAGING when docker fetcher failed to fetch the image
> 
>
> Key: MESOS-4878
> URL: https://issues.apache.org/jira/browse/MESOS-4878
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker
>Affects Versions: 0.27.0, 0.27.1, 0.28.0
>Reporter: Shuai Lin
>Assignee: Shuai Lin
>
> When a task is launched with the mesos containerizer and a docker image, if 
> the docker fetcher fails to pull the image, no more task updates are sent to 
> the scheduler.
> {code}
> I0306 17:28:57.627169 17647 registry_puller.cpp:194] Pulling image 
> 'alpine:latest' from 
> 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to 
> '/tmp/mesos-test/store/docker/staging/V2dqJv'
> E0306 17:29:00.749889 17651 slave.cpp:3773] Container 
> '6b98026b-a58d-434c-9432-b517012edc35' for executor 'just-a-test' of 
> framework a4ff93ba-2141-48e2-92a9-7354e4028282- failed to start: Collect 
> failed: Unexpected HTTP response '401 Unauthorized' when trying to get the 
> manifest
> I0306 17:29:00.751579 17646 containerizer.cpp:1392] Destroying container 
> '6b98026b-a58d-434c-9432-b517012edc35'
> I0306 17:29:00.752188 17646 containerizer.cpp:1395] Waiting for the isolators 
> to complete preparing before destroying the container
> I0306 17:29:57.618649 17649 slave.cpp:4322] Terminating executor 
> ''just-a-test' of framework a4ff93ba-2141-48e2-92a9-73
> {code}
> Scheduler logs:
> {code}
> sudo ./build/src/mesos-execute --docker_image=alpine:latest 
> --containerizer=mesos --name=just-a-test --command="sleep 1000" 
> --master=33.33.33.33:5050
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> W0306 17:28:57.491081 17740 sched.cpp:1642] 
> **
> Scheduler driver bound to loopback interface! Cannot communicate with remote 
> master(s). You might want to set 'LIBPROCESS_IP' environment variable to use 
> a routable IP address.
> **
> I0306 17:28:57.498028 17740 sched.cpp:222] Version: 0.29.0
> I0306 17:28:57.533071 17761 sched.cpp:326] New master detected at 
> master@33.33.33.33:5050
> I0306 17:28:57.536761 17761 sched.cpp:336] No credentials provided. 
> Attempting to register without authentication
> I0306 17:28:57.557729 17759 sched.cpp:703] Framework registered with 
> a4ff93ba-2141-48e2-92a9-7354e4028282-
> Framework registered with a4ff93ba-2141-48e2-92a9-7354e4028282-
> task just-a-test submitted to slave a4ff93ba-2141-48e2-92a9-7354e4028282-S0
> {code}





[jira] [Commented] (MESOS-4932) Propose Design for Authorization based filtering for endpoints.

2016-03-19 Thread Kevin Klues (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201830#comment-15201830
 ] 

Kevin Klues commented on MESOS-4932:


I'd probably use the blanket term "role" to refer to the combined 
framework/task roles (or any other category of roles we come up with), and then 
use the category name when we need to get specific.  That way we can talk about 
a role consisting of framework roles and task roles, etc.

> Propose Design for Authorization based filtering for endpoints.
> ---
>
> Key: MESOS-4932
> URL: https://issues.apache.org/jira/browse/MESOS-4932
> Project: Mesos
>  Issue Type: Task
>  Components: security
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: authorization, mesosphere, security
>






[jira] [Comment Edited] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197927#comment-15197927
 ] 

Avinash Sridharan edited comment on MESOS-4823 at 3/16/16 7:00 PM:
---

[~djosborne] interesting point. I guess the ticket is a bit misleading. The 
fact that containers are addressable (layer 3 addressable) doesn't mean that 
their IP addresses are globally routable. The idea here was to provide NAT 
capability, along with the ability for the containers to specify the ports (if 
required) on which they want to expose their service. While the CNI spec allows 
the IP masquerade option to be specified, it doesn't specify any mechanisms to 
specify port forwarding rules. This is particularly essential to support any 
EXPOSE primitives specified by the images (as with docker's EXPOSE primitives). 

I have raised this issue in the cni-dev mailing list as well,
https://groups.google.com/forum/#!topic/cni-dev/FW3BCFJwAxY

and it does seem like there are other folks interested in port forwarding and 
firewall rules to be part of the CNI spec. Currently, however, this is not the 
case and hence we will need to support it in the isolator. 





was (Author: avin...@mesosphere.io):
[~djosborne] interesting point. I guess the ticket is a bit misleading. The 
fact that containers are addressable (layer 3 addressable) doesn't mean that 
their IP addresses are globally routable. The idea here was to provide NAT 
capability, along with the ability for the containers to specify the ports (if 
required) on which they want to expose their service. While the CNI spec allows 
the IP masquerade option to be specified, it doesn't specify any mechanisms to 
specify port forwarding rules. This is particularly essential to support any 
EXPOSE primitives specified by the images (as with docker's EXPOSE primitives). 

I have raised this issue in the cni-dev mailing list as well, and it seems like 
there are other folks that are interested in this requirement as well 
https://groups.google.com/forum/#!topic/cni-dev/FW3BCFJwAxY

and it does seem like there are other folks interested in port forwarding and 
firewalling rules to be part of the CNI spec. Currently however this is not the 
case and hence we will need to support it in the isolator. 




> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are 
> listening on, to the outside world. When containers are running on bridged (or ptp) 
> networking this can be achieved by installing port forwarding rules on the 
> agent (using iptables). This can be done in the `network/cni` isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.





[jira] [Updated] (MESOS-4828) XFS disk quota isolator

2016-03-19 Thread James Peach (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4828?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

James Peach updated MESOS-4828:
---
Description: 
Implement a disk resource isolator using XFS project quotas. Compared to the 
{{posix/disk}} isolator, this doesn't need to scan the filesystem periodically, 
and applications receive an {{EDQUOT}} error instead of being summarily killed.

This initial implementation only isolates sandbox directory resources, since 
isolation doesn't have any visibility into the lifecycle of volumes, which 
is needed to assign and track project IDs.

The build dependencies for this are the XFS headers (from xfsprogs-devel) and 
libblkid. We need libblkid or the equivalent to map filesystem paths to block 
devices in order to apply quota.

  was:
Implement a disk resource isolator using XFS project quotas. Compared to the 
{{posix/disk}} isolator, this doesn't need to scan the filesystem periodically, 
and applications receive an {{ENOSPC}} error instead of being summarily killed.

This initial implementation only isolates sandbox directory resources, since 
isolation doesn't have any visibility into the lifecycle of volumes, which 
is needed to assign and track project IDs.

The build dependencies for this are the XFS headers (from xfsprogs-devel) and 
libblkid. We need libblkid or the equivalent to map filesystem paths to block 
devices in order to apply quota.


> XFS disk quota isolator
> ---
>
> Key: MESOS-4828
> URL: https://issues.apache.org/jira/browse/MESOS-4828
> Project: Mesos
>  Issue Type: Improvement
>  Components: isolation
>Reporter: James Peach
>Assignee: James Peach
>
> Implement a disk resource isolator using XFS project quotas. Compared to the 
> {{posix/disk}} isolator, this doesn't need to scan the filesystem 
> periodically, and applications receive an {{EDQUOT}} error instead of being 
> summarily killed.
> This initial implementation only isolates sandbox directory resources, since 
> isolation doesn't have any visibility into the lifecycle of volumes, 
> which is needed to assign and track project IDs.
> The build dependencies for this are the XFS headers (from xfsprogs-devel) and 
> libblkid. We need libblkid or the equivalent to map filesystem paths to block 
> devices in order to apply quota.
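
From the application's side, the practical difference is that a write past the quota fails with an error it can handle rather than the task being killed. The sketch below shows one way a task might distinguish the two errno values mentioned in this ticket; classify_write_error is a hypothetical helper, and it assumes a platform (such as Linux) where errno.EDQUOT is defined.

```python
import errno


def classify_write_error(error: OSError) -> str:
    """Map an OSError from a failed write to a coarse category."""
    if error.errno == errno.EDQUOT:
        return "quota exceeded"
    if error.errno == errno.ENOSPC:
        return "filesystem full"
    return "other"
```

An application could call this from an except OSError block around its writes and, for instance, trim its own sandbox usage when the result is "quota exceeded".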





[jira] [Updated] (MESOS-4634) Tests will dereference stack allocated master objects upon assertion/expectation failure.

2016-03-19 Thread Michael Park (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Park updated MESOS-4634:

Shepherd: Michael Park  (was: Bernd Mathiske)

> Tests will dereference stack allocated master objects upon 
> assertion/expectation failure.
> -
>
> Key: MESOS-4634
> URL: https://issues.apache.org/jira/browse/MESOS-4634
> Project: Mesos
>  Issue Type: Bug
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: flaky, mesosphere, tech-debt, test
> Fix For: 0.29.0
>
>
> Tests that use the {{StartMaster}} test helper are generally fragile when the 
> test fails an assert/expect in the middle of the test.  This is because the 
> {{StartMaster}} helper takes raw pointer arguments, which may be 
> stack-allocated.
> In case of an assert failure, the test immediately exits (destroying stack 
> allocated objects) and proceeds onto test cleanup.  The test cleanup may 
> dereference some of these destroyed objects, leading to a test crash like:
> {code}
> [18:27:36][Step 8/8] F0204 18:27:35.981302 23085 logging.cpp:64] RAW: Pure 
> virtual method called
> [18:27:36][Step 8/8] @ 0x7f7077055e1c  google::LogMessage::Fail()
> [18:27:36][Step 8/8] @ 0x7f707705ba6f  google::RawLog__()
> [18:27:36][Step 8/8] @ 0x7f70760f76c9  __cxa_pure_virtual
> [18:27:36][Step 8/8] @   0xa9423c  
> mesos::internal::tests::Cluster::Slaves::shutdown()
> [18:27:36][Step 8/8] @  0x1074e45  
> mesos::internal::tests::MesosTest::ShutdownSlaves()
> [18:27:36][Step 8/8] @  0x1074de4  
> mesos::internal::tests::MesosTest::Shutdown()
> [18:27:36][Step 8/8] @  0x1070ec7  
> mesos::internal::tests::MesosTest::TearDown()
> {code}
> The {{StartMaster}} helper should take {{shared_ptr}} arguments instead.
> This also means that we can remove the {{Shutdown}} helper from most of these 
> tests.





[jira] [Commented] (MESOS-3902) The Location header when non-leading master redirects to leading master is incomplete.

2016-03-19 Thread Ashwin Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197445#comment-15197445
 ] 

Ashwin Murthy commented on MESOS-3902:
--

OK, yes, that was what I indicated. The scheme and path from the original 
request should be part of this. Straightforward fix, contained inside the 
master:http:redirect
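
To make the intended result concrete, here is a hedged sketch of building a complete Location value from the original request URL and the leading master's address. It uses Python's standard urllib.parse and is not the Mesos redirect code; redirect_location and its parameters are hypothetical names.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_location(request_url: str, leader_host: str, leader_port: int) -> str:
    """Keep the request's scheme, path and query; swap in the leader's address."""
    parts = urlsplit(request_url)
    netloc = f"{leader_host}:{leader_port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, ""))
```

With the request from the issue description, a redirect from 127.1.0.3:5050 to a leader at 127.1.0.1:5050 would then carry the /api/v1/scheduler path instead of stopping at the host and port.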

> The Location header when non-leading master redirects to leading master is 
> incomplete.
> --
>
> Key: MESOS-3902
> URL: https://issues.apache.org/jira/browse/MESOS-3902
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, master
>Affects Versions: 0.25.0
> Environment: 3 masters, 10 slaves
>Reporter: Ben Whitehead
>Assignee: Ashwin Murthy
>  Labels: mesosphere
>
> The master now sets a location header, but it's incomplete. The path of the 
> URL isn't set. Consider an example:
> {code}
> > cat /tmp/subscribe-1072944352375841456 | httpp POST 
> > 127.1.0.3:5050/api/v1/scheduler Content-Type:application/x-protobuf
> POST /api/v1/scheduler HTTP/1.1
> Accept: application/json
> Accept-Encoding: gzip, deflate
> Connection: keep-alive
> Content-Length: 123
> Content-Type: application/x-protobuf
> Host: 127.1.0.3:5050
> User-Agent: HTTPie/0.9.0
> +-----------------------------------------+
> | NOTE: binary data not shown in terminal |
> +-----------------------------------------+
> HTTP/1.1 307 Temporary Redirect
> Content-Length: 0
> Date: Fri, 26 Feb 2016 00:54:41 GMT
> Location: //127.1.0.1:5050
> {code}





[jira] [Commented] (MESOS-4953) DockerFetcherPluginTest.INTERNET_CURL_FetchManifest is flaky

2016-03-19 Thread Anand Mazumdar (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197856#comment-15197856
 ] 

Anand Mazumdar commented on MESOS-4953:
---

My bad, I pasted the wrong test in the title. Updated.

> DockerFetcherPluginTest.INTERNET_CURL_FetchManifest is flaky
> 
>
> Key: MESOS-4953
> URL: https://issues.apache.org/jira/browse/MESOS-4953
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, fetcher
>Reporter: Anand Mazumdar
>  Labels: flaky, flaky-test, mesosphere
>
> This test fails quite regularly on my linux box. Relevant verbose logs:
> {code}
> [ RUN  ] DockerFetcherPluginTest.INTERNET_CURL_FetchManifest
> E0315 17:28:59.233052 25940 shell.hpp:106] Command 'hadoop version 2>&1' 
> failed; this is the output:
> sh: 1: hadoop: not found
> E0315 17:28:59.233104 25940 fetcher.cpp:59] Failed to create URI fetcher 
> plugin 'hadoop': Failed to create HDFS client: Failed to execute 'hadoop 
> version 2>&1'; the command was either not found or exited with a non-zero 
> exit status: 127
> ../../src/tests/uri_fetcher_tests.cpp:230: Failure
> Failed to wait 1mins for fetcher.get()->fetch(uri, dir)
> {code}





[jira] [Commented] (MESOS-3902) The Location header when non-leading master redirects to leading master is incomplete.

2016-03-19 Thread Ashwin Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199976#comment-15199976
 ] 

Ashwin Murthy commented on MESOS-3902:
--

ok, sounds good. Sending the review. 

> The Location header when non-leading master redirects to leading master is 
> incomplete.
> --
>
> Key: MESOS-3902
> URL: https://issues.apache.org/jira/browse/MESOS-3902
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, master
>Affects Versions: 0.25.0
> Environment: 3 masters, 10 slaves
>Reporter: Ben Whitehead
>Assignee: Ashwin Murthy
>  Labels: mesosphere
>
> The master now sets a location header, but it's incomplete. The path of the 
> URL isn't set. Consider an example:
> {code}
> > cat /tmp/subscribe-1072944352375841456 | httpp POST 
> > 127.1.0.3:5050/api/v1/scheduler Content-Type:application/x-protobuf
> POST /api/v1/scheduler HTTP/1.1
> Accept: application/json
> Accept-Encoding: gzip, deflate
> Connection: keep-alive
> Content-Length: 123
> Content-Type: application/x-protobuf
> Host: 127.1.0.3:5050
> User-Agent: HTTPie/0.9.0
> +-----------------------------------------+
> | NOTE: binary data not shown in terminal |
> +-----------------------------------------+
> HTTP/1.1 307 Temporary Redirect
> Content-Length: 0
> Date: Fri, 26 Feb 2016 00:54:41 GMT
> Location: //127.1.0.1:5050
> {code}





[jira] [Updated] (MESOS-4953) DockerFetcherPluginTest.INTERNET_CURL_FetchManifest is flaky

2016-03-19 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-4953:
--
Summary: DockerFetcherPluginTest.INTERNET_CURL_FetchManifest is flaky  
(was: DockerFetcherPluginTest.INTERNET_CURL_FetchImage is flaky)

> DockerFetcherPluginTest.INTERNET_CURL_FetchManifest is flaky
> 
>
> Key: MESOS-4953
> URL: https://issues.apache.org/jira/browse/MESOS-4953
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, fetcher
>Reporter: Anand Mazumdar
>  Labels: flaky, flaky-test, mesosphere
>
> This test fails quite regularly on my linux box. Relevant verbose logs:
> {code}
> [ RUN  ] DockerFetcherPluginTest.INTERNET_CURL_FetchManifest
> E0315 17:28:59.233052 25940 shell.hpp:106] Command 'hadoop version 2>&1' 
> failed; this is the output:
> sh: 1: hadoop: not found
> E0315 17:28:59.233104 25940 fetcher.cpp:59] Failed to create URI fetcher 
> plugin 'hadoop': Failed to create HDFS client: Failed to execute 'hadoop 
> version 2>&1'; the command was either not found or exited with a non-zero 
> exit status: 127
> ../../src/tests/uri_fetcher_tests.cpp:230: Failure
> Failed to wait 1mins for fetcher.get()->fetch(uri, dir)
> {code}





[jira] [Assigned] (MESOS-3573) Mesos does not kill orphaned docker containers

2016-03-19 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3573?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar reassigned MESOS-3573:
-

Assignee: Anand Mazumdar

> Mesos does not kill orphaned docker containers
> --
>
> Key: MESOS-3573
> URL: https://issues.apache.org/jira/browse/MESOS-3573
> Project: Mesos
>  Issue Type: Bug
>  Components: docker, slave
>Reporter: Ian Babrou
>Assignee: Anand Mazumdar
>  Labels: mesosphere
>
> After upgrade to 0.24.0 we noticed hanging containers appearing. Looks like 
> there were changes between 0.23.0 and 0.24.0 that broke cleanup.
> Here's how to trigger this bug:
> 1. Deploy app in docker container.
> 2. Kill corresponding mesos-docker-executor process
> 3. Observe hanging container
> Here are the logs after kill:
> {noformat}
> slave_1| I1002 12:12:59.362002  7791 docker.cpp:1576] Executor for 
> container 'f083aaa2-d5c3-43c1-b6ba-342de8829fa8' has exited
> slave_1| I1002 12:12:59.362284  7791 docker.cpp:1374] Destroying 
> container 'f083aaa2-d5c3-43c1-b6ba-342de8829fa8'
> slave_1| I1002 12:12:59.363404  7791 docker.cpp:1478] Running docker stop 
> on container 'f083aaa2-d5c3-43c1-b6ba-342de8829fa8'
> slave_1| I1002 12:12:59.363876  7791 slave.cpp:3399] Executor 
> 'sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c' of framework 
> 20150923-122130-2153451692-5050-1- terminated with signal Terminated
> slave_1| I1002 12:12:59.367570  7791 slave.cpp:2696] Handling status 
> update TASK_FAILED (UUID: 4a1b2387-a469-4f01-bfcb-0d1cccbde550) for task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1- from @0.0.0.0:0
> slave_1| I1002 12:12:59.367842  7791 slave.cpp:5094] Terminating task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c
> slave_1| W1002 12:12:59.368484  7791 docker.cpp:986] Ignoring updating 
> unknown container: f083aaa2-d5c3-43c1-b6ba-342de8829fa8
> slave_1| I1002 12:12:59.368671  7791 status_update_manager.cpp:322] 
> Received status update TASK_FAILED (UUID: 
> 4a1b2387-a469-4f01-bfcb-0d1cccbde550) for task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1-
> slave_1| I1002 12:12:59.368741  7791 status_update_manager.cpp:826] 
> Checkpointing UPDATE for status update TASK_FAILED (UUID: 
> 4a1b2387-a469-4f01-bfcb-0d1cccbde550) for task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1-
> slave_1| I1002 12:12:59.370636  7791 status_update_manager.cpp:376] 
> Forwarding update TASK_FAILED (UUID: 4a1b2387-a469-4f01-bfcb-0d1cccbde550) 
> for task sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1- to the slave
> slave_1| I1002 12:12:59.371335  7791 slave.cpp:2975] Forwarding the 
> update TASK_FAILED (UUID: 4a1b2387-a469-4f01-bfcb-0d1cccbde550) for task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1- to master@172.16.91.128:5050
> slave_1| I1002 12:12:59.371908  7791 slave.cpp:2899] Status update 
> manager successfully handled status update TASK_FAILED (UUID: 
> 4a1b2387-a469-4f01-bfcb-0d1cccbde550) for task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1-
> master_1   | I1002 12:12:59.372047    11 master.cpp:4069] Status update 
> TASK_FAILED (UUID: 4a1b2387-a469-4f01-bfcb-0d1cccbde550) for task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1- from slave 
> 20151002-120829-2153451692-5050-1-S0 at slave(1)@172.16.91.128:5051 
> (172.16.91.128)
> master_1   | I1002 12:12:59.372534    11 master.cpp:4108] Forwarding status 
> update TASK_FAILED (UUID: 4a1b2387-a469-4f01-bfcb-0d1cccbde550) for task 
> sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1-
> master_1   | I1002 12:12:59.373018    11 master.cpp:5576] Updating the latest 
> state of task sleepy.87eb6191-68fe-11e5-9444-8eb895523b9c of framework 
> 20150923-122130-2153451692-5050-1- to TASK_FAILED
> master_1   | I1002 12:12:59.373447    11 hierarchical.hpp:814] Recovered 
> cpus(*):0.1; mem(*):16; ports(*):[31685-31685] (total: cpus(*):4; 
> mem(*):1001; disk(*):52869; ports(*):[31000-32000], allocated: 
> cpus(*):8.32667e-17) on slave 20151002-120829-2153451692-5050-1-S0 from 
> framework 20150923-122130-2153451692-5050-1-
> {noformat}
> Another issue: if you restart mesos-slave on the host with orphaned docker 
> containers, they are not getting killed. This was the case before and I hoped 
> for this trick to kill hanging containers, but it doesn't work now.
> Marking this as critical because it hoards cluster resources and blocks 
> scheduli

[jira] [Created] (MESOS-4973) Duplicates in 'unregistered_frameworks' in /state

2016-03-19 Thread Yan Xu (JIRA)
Yan Xu created MESOS-4973:
-

 Summary: Duplicates in 'unregistered_frameworks' in /state 
 Key: MESOS-4973
 URL: https://issues.apache.org/jira/browse/MESOS-4973
 Project: Mesos
  Issue Type: Bug
Reporter: Yan Xu


In our clusters where many frameworks run, 'unregistered_frameworks' currently 
doesn't show what it semantically means, but rather "a list of frameworkIDs for 
each orphaned task", which means a lot of duplicated frameworkIDs.

For this field to be useful we need to deduplicate when outputting the list.
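
A minimal sketch of the deduplication asked for above, assuming the list is currently built by emitting one frameworkID per orphaned task. The function name and the sample IDs are illustrative only and are not taken from the Mesos source:

```python
def unique_framework_ids(orphan_task_framework_ids):
    """Return each framework ID once, preserving first-seen order."""
    # dict keys keep insertion order (Python 3.7+), so repeats are dropped
    # while the order in which frameworks were first seen is retained.
    return list(dict.fromkeys(orphan_task_framework_ids))


# Hypothetical input: one entry per orphaned task, so IDs repeat.
per_task_ids = ["fw-A", "fw-B", "fw-A", "fw-A", "fw-C", "fw-B"]
print(unique_framework_ids(per_task_ids))
```

Keeping first-seen order is a choice made for this sketch; a set would also remove the duplicates if ordering in /state does not matter.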





[jira] [Commented] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198323#comment-15198323
 ] 

Avinash Sridharan commented on MESOS-4823:
--

But how would the CNI isolator know if the underlying plugin has the capability 
or not? Or for that matter the parameters that it needs to pass to the plugin?

> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are 
> listening on, to the outside world. When containers are running on bridged 
> (or ptp) networking this can be achieved by installing port forwarding rules 
> on the agent (using iptables). This can be done in the `network/cni` 
> isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.





[jira] [Commented] (MESOS-3368) Add device support in cgroups abstraction

2016-03-19 Thread Abhishek Dasgupta (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200225#comment-15200225
 ] 

Abhishek Dasgupta commented on MESOS-3368:
--

Previous reviews have been discarded. The new reviews are:
https://reviews.apache.org/r/44974/
https://reviews.apache.org/r/44975/

> Add device support in cgroups abstraction
> -
>
> Key: MESOS-3368
> URL: https://issues.apache.org/jira/browse/MESOS-3368
> Project: Mesos
>  Issue Type: Task
>Reporter: Niklas Quarfot Nielsen
>Assignee: Abhishek Dasgupta
>
> Add support for [device 
> cgroups|https://www.kernel.org/doc/Documentation/cgroup-v1/devices.txt] to 
> aid isolators controlling access to devices.
> In the future, we could think about how to enumerate and control access to 
> devices as a resource or as task/container policy.





[jira] [Commented] (MESOS-3902) The Location header when non-leading master redirects to leading master is incomplete.

2016-03-19 Thread Ashwin Murthy (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199915#comment-15199915
 ] 

Ashwin Murthy commented on MESOS-3902:
--

The fix is ready and I am trying to add a test for it, but I couldn't find any 
redirect-related tests yet. Should I add a test to the scheduler HTTP tests?

> The Location header when non-leading master redirects to leading master is 
> incomplete.
> --
>
> Key: MESOS-3902
> URL: https://issues.apache.org/jira/browse/MESOS-3902
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, master
>Affects Versions: 0.25.0
> Environment: 3 masters, 10 slaves
>Reporter: Ben Whitehead
>Assignee: Ashwin Murthy
>  Labels: mesosphere
>
> The master now sets a Location header, but it's incomplete: the path of the 
> URL isn't set. Consider an example:
> {code}
> > cat /tmp/subscribe-1072944352375841456 | http POST 
> > 127.1.0.3:5050/api/v1/scheduler Content-Type:application/x-protobuf
> POST /api/v1/scheduler HTTP/1.1
> Accept: application/json
> Accept-Encoding: gzip, deflate
> Connection: keep-alive
> Content-Length: 123
> Content-Type: application/x-protobuf
> Host: 127.1.0.3:5050
> User-Agent: HTTPie/0.9.0
> +-+
> | NOTE: binary data not shown in terminal |
> +-+
> HTTP/1.1 307 Temporary Redirect
> Content-Length: 0
> Date: Fri, 26 Feb 2016 00:54:41 GMT
> Location: //127.1.0.1:5050
> {code}
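
For illustration, a small sketch of what a complete redirect value could look like for the request above. It assumes the intended fix is to append the original request path to the scheme-relative leader address; the function name is hypothetical and this is not the actual master code:

```python
def redirect_location(leader_host, leader_port, request_path):
    """Build a scheme-relative Location value that keeps the request path."""
    # Normalize so the path always starts with a single leading slash.
    if not request_path.startswith("/"):
        request_path = "/" + request_path
    return "//{}:{}{}".format(leader_host, leader_port, request_path)


# Values taken from the example in the report above.
print(redirect_location("127.1.0.1", 5050, "/api/v1/scheduler"))
```

With the path included, a client following the 307 would re-send the POST to the scheduler endpoint on the leading master instead of to its root.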





[jira] [Updated] (MESOS-4964) curl based docker fetcher fails to decode chunked encoding

2016-03-19 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4964:
--
Fix Version/s: 0.29.0

> curl based docker fetcher fails to decode chunked encoding
> --
>
> Key: MESOS-4964
> URL: https://issues.apache.org/jira/browse/MESOS-4964
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Affects Versions: 0.28.0
>Reporter: James Peach
>Assignee: Jie Yu
>  Labels: mesosphere
> Fix For: 0.29.0
>
>
> If the curl-based fetcher gets an HTTP response that is chunked, the HTTP 
> decode fails because the response headers say it is chunked, but curl has 
> already dechunked the body it writes to stdout.
> {code}
> E0316 15:23:31.124482 13299 slave.cpp:3773] Container 
> 'fa06a5ee-637e-480c-b602-59705b707d85' for executor 'jpeach.10489' of 
> framework 96d1191b-cdf0-40f6-8840-e4d4d92a9345-0010 failed to start: Collect 
> failed: Failed to decode HTTP responses: Decoding failed
> HTTP/1.1 400 Bad Request
> Server: nginx/1.9.4
> Date: Wed, 16 Mar 2016 22:23:30 GMT
> Content-Type: application/json
> Transfer-Encoding: chunked
> Connection: keep-alive
> X-Artifactory-Id: ae6c9bffd47ec19a:-61ef0a68:1537a605a05:-8000
> {
>   "errors" : [ {
> "status" : 400,
> "message" : "Unsupported docker v2 repository request for 
> 'docker-registry'"
>   } ]
> }
> {code}
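
The mismatch can be reproduced with a toy chunked-body decoder. This is only a sketch of the general failure mode, assuming a decoder that trusts the Transfer-Encoding header; it is not the libprocess decoder, and it ignores trailers after the last chunk:

```python
def decode_chunked(body: bytes) -> bytes:
    """Decode an HTTP/1.1 chunked body; raise ValueError if it is not chunked."""
    out = bytearray()
    pos = 0
    while True:
        eol = body.find(b"\r\n", pos)
        if eol == -1:
            raise ValueError("missing chunk-size line")
        # The chunk size is hexadecimal; drop any ";extension" suffix.
        size_token = body[pos:eol].split(b";", 1)[0].strip()
        try:
            size = int(size_token, 16)
        except ValueError:
            raise ValueError("invalid chunk size: {!r}".format(size_token)) from None
        pos = eol + 2
        if size == 0:
            return bytes(out)
        chunk = body[pos:pos + size]
        if len(chunk) != size:
            raise ValueError("truncated chunk")
        out += chunk
        pos += size
        if body[pos:pos + 2] != b"\r\n":
            raise ValueError("missing CRLF after chunk")
        pos += 2


# A genuinely chunked body decodes fine.
print(decode_chunked(b"4\r\nWiki\r\n5\r\npedia\r\n0\r\n\r\n"))

# A body that was already dechunked (as curl does) is rejected.
try:
    decode_chunked(b'{\n  "errors" : []\n}')
except ValueError as error:
    print("decode failed:", error)
```

One way out, under the same assumption, is to drop the Transfer-Encoding header (or treat the body as identity-encoded) before handing curl's output to the decoder.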





[jira] [Updated] (MESOS-4983) Segfault in ProcessTest.Spawn with GCC 6

2016-03-19 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-4983:
---
Description: 
{{ProcessTest.Spawn}} fails deterministically for me with GCC 6 and 
{{--enable-optimize}}. Recent Arch Linux, GCC "6.0.0 20160227".

{noformat}
[ RUN  ] ProcessTest.Spawn
*** Aborted at 145817 (unix time) try "date -d @145817" if you are 
using GNU date ***
PC: @   0x522926 SpawnProcess::initialize()
*** SIGSEGV (@0x0) received by PID 11359 (TID 0x7faa6075f700) from PID 0; stack 
trace: ***
@ 0x7faa670dbe80 (unknown)
@   0x522926 SpawnProcess::initialize()
@   0x646fa6 process::ProcessManager::resume()
@   0x6471ff 
_ZNSt6thread11_State_implISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt6atomicIbEE_St17reference_wrapperIS7_EEEvEEE6_M_runEv
@ 0x7faa6764a812 execute_native_thread_routine
@ 0x7faa670d2424 start_thread
@ 0x7faa65b04cbd __clone
@0x0 (unknown)
Makefile:1748: recipe for target 'check-local' failed
make[5]: *** [check-local] Segmentation fault (core dumped)
{noformat}

Backtrace:

{noformat}
Program terminated with signal SIGSEGV, Segmentation fault.
#0  testing::internal::ActionResultHolder::GetValueAndDelete (this=0x0) 
at 3rdparty/gmock-1.7.0/include/gmock/gmock-spec-builders.h:1373
1373  void GetValueAndDelete() const { delete this; }
[Current thread is 1 (Thread 0x7faa6075f700 (LWP 11365))]
(gdb) bt
#0  testing::internal::ActionResultHolder::GetValueAndDelete (this=0x0) 
at 3rdparty/gmock-1.7.0/include/gmock/gmock-spec-builders.h:1373
#1  testing::internal::FunctionMockerBase::InvokeWith(std::tuple<> 
const&) (args=empty std::tuple, this=0x712a7c88) at 
3rdparty/gmock-1.7.0/include/gmock/gmock-spec-builders.h:1530
#2  testing::internal::FunctionMocker::Invoke() (this=0x712a7c88) 
at 3rdparty/gmock-1.7.0/include/gmock/gmock-generated-function-mockers.h:76
#3  SpawnProcess::initialize (this=0x712a7c80) at 
/mesos-2/3rdparty/libprocess/src/tests/process_tests.cpp:113
#4  0x00646fa6 in process::ProcessManager::resume (this=0x25a2b60, 
process=0x712a7d38) at /mesos-2/3rdparty/libprocess/src/process.cpp:2504
#5  0x006471ff in process::ProcessManageroperator() (__closure=, joining=...) at 
/mesos-2/3rdparty/libprocess/src/process.cpp:2218
#6  std::_Bind(std::reference_wrapper >)>::__call (__args=, this=) at 
/home/vagrant/local/gcc/include/c++/6.0.0/functional:943
#7  std::_Bind(std::reference_wrapper >)>::operator()<> 
(this=) at 
/home/vagrant/local/gcc/include/c++/6.0.0/functional:1002
#8  
std::_Bind_simple(std::reference_wrapper 
>)>()>::_M_invoke<> (this=) at 
/home/vagrant/local/gcc/include/c++/6.0.0/functional:1400
#9  
std::_Bind_simple(std::reference_wrapper 
>)>()>::operator() (this=) at 
/home/vagrant/local/gcc/include/c++/6.0.0/functional:1389
#10 
std::thread::_State_impl(std::reference_wrapper >)>()> 
>::_M_run(void) (this=) at 
/home/vagrant/local/gcc/include/c++/6.0.0/thread:196
#11 0x7faa6764a812 in std::(anonymous 
namespace)::execute_native_thread_routine (__p=0x25a3bf0) at 
../../../../../gcc-trunk/libstdc++-v3/src/c++11/thread.cc:83
#12 0x7faa670d2424 in start_thread () from /usr/lib/libpthread.so.0
#13 0x7faa65b04cbd in clone () from /usr/lib/libc.so.6
{noformat}

  was:
{{ProcessTest.Spawn}} fails deterministically for me with GCC 6 and 
{{--enable-optimize}}.

{noformat}
[ RUN  ] ProcessTest.Spawn
*** Aborted at 145817 (unix time) try "date -d @145817" if you are 
using GNU date ***
PC: @   0x522926 SpawnProcess::initialize()
*** SIGSEGV (@0x0) received by PID 11359 (TID 0x7faa6075f700) from PID 0; stack 
trace: ***
@ 0x7faa670dbe80 (unknown)
@   0x522926 SpawnProcess::initialize()
@   0x646fa6 process::ProcessManager::resume()
@   0x6471ff 
_ZNSt6thread11_State_implISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt6atomicIbEE_St17reference_wrapperIS7_EEEvEEE6_M_runEv
@ 0x7faa6764a812 execute_native_thread_routine
@ 0x7faa670d2424 start_thread
@ 0x7faa65b04cbd __clone
@0x0 (unknown)
Makefile:1748: recipe for target 'check-local' failed
make[5]: *** [check-local] Segmentation fault (core dumped)
{noformat}

Backtrace:

{noformat}
Program terminated with signal SIGSEGV, Segmentation fault.
#0  testing::internal::ActionResultHolder::GetValueAndDelete (this=0x0) 
at 3rdparty/gmock-1.7.0/include/gmock/gmock-spec-builders.h:1373
1373  void GetValueAndDelete() const { delete this; }
[Current thread is 1 (Thread 0x7faa6075f700 (LWP 11365))]
(gdb) bt
#0  testing::internal::ActionResultHolder::GetValueAndDelete (this=0x0) 
at 3rdparty/gmock-1.7.0/include/gmock/gmock-spec-builders.h:1373
#1  testing::internal::FunctionMockerBase::InvokeWith(std::tuple<> 

[jira] [Commented] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197927#comment-15197927
 ] 

Avinash Sridharan commented on MESOS-4823:
--

[~djosborne] interesting point. I guess the ticket is a bit misleading. 
The fact that containers are addressable (layer 3 addressable) doesn't mean 
that their IP addresses are globally routable. The idea here was to provide 
NAT capability, along with the ability for containers to specify the ports 
(if required) on which they want to expose their service. While the CNI spec 
allows the IP masquerade option to be specified, it doesn't define any 
mechanism for specifying port forwarding rules. This is particularly essential 
to support any EXPOSE primitives specified by the images (as with docker's 
EXPOSE primitives). 

I have raised this issue on the cni-dev mailing list as well: 
https://groups.google.com/forum/#!topic/cni-dev/FW3BCFJwAxY

It does seem like there are other folks interested in making port forwarding 
and firewalling rules part of the CNI spec. Currently, however, this is not the 
case, and hence we will need to support it in the isolator. 
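
As a rough sketch of the kind of rule the isolator would need to install, the snippet below builds (but does not run) an iptables DNAT command from the two pieces of information mentioned in the ticket: the exposed port and the container IP. The choice of the PREROUTING chain, the helper name, and the sample address and ports are assumptions made for illustration, not the isolator's actual design:

```python
def dnat_rule_args(container_ip, host_port, container_port, protocol="tcp"):
    """Return iptables arguments for one host-port -> container-port DNAT rule."""
    return [
        "iptables", "-t", "nat", "-A", "PREROUTING",
        "-p", protocol, "--dport", str(host_port),
        "-j", "DNAT",
        "--to-destination", "{}:{}".format(container_ip, container_port),
    ]


# Hypothetical container address and port mapping.
print(" ".join(dnat_rule_args("192.168.0.5", 31685, 80)))
```

The sketch only assembles the argument list so that it can be inspected; actually applying it requires root on the agent and a matching cleanup rule when the container exits.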




> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are listening on, 
> to the outside world. When containers are running on bridged (or ptp) 
> networking this can be achieved by installing port forwarding rules on the 
> agent (using iptables). This can be done in the `network/cni` isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.





[jira] [Assigned] (MESOS-4093) Unify namespace order in mesos code

2016-03-19 Thread Yong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Tang reassigned MESOS-4093:


Assignee: Yong Tang

> Unify namespace order in mesos code
> ---
>
> Key: MESOS-4093
> URL: https://issues.apache.org/jira/browse/MESOS-4093
> Project: Mesos
>  Issue Type: Bug
>Reporter: Guangya Liu
>Assignee: Yong Tang
>
> This is from code review for https://reviews.apache.org/r/40995/
> There is no rule for where to put the std namespace.
> Style 1)
> {code}
> using process::Clock;
> using process::Future;
> using process::Message;
> using process::PID;
> using std::vector;
> {code}
> Style 2)
> {code}
> using std::string;
> using std::vector;
> using google::protobuf::RepeatedPtrField;
> using mesos::internal::master::Master;
> using mesos::internal::slave::Slave;
> using mesos::quota::QuotaInfo;
> using process::Future;
> using process::PID;
> using process::http::BadRequest;
> using process::http::Conflict;
> using process::http::OK;
> using process::http::Response;
> {code}
> I think that we should always follow style 2), putting std at the 
> beginning since it is the system library.
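
A small sketch of how style 2) could be applied mechanically. It assumes one `using` declaration per line and sorts each group in plain ASCII order; the within-group sorting is an assumption of this sketch (the ticket only asks for std to come first), and the helper name is hypothetical:

```python
def order_using_declarations(lines):
    """Put `using std::...;` lines first, then the rest, each group sorted."""
    std_lines = sorted(l for l in lines if l.startswith("using std::"))
    other_lines = sorted(l for l in lines if not l.startswith("using std::"))
    return std_lines + other_lines


# The "Style 1" block from the ticket, reordered to follow style 2).
style_1 = [
    "using process::Clock;",
    "using process::Future;",
    "using process::Message;",
    "using process::PID;",
    "using std::vector;",
]
print("\n".join(order_using_declarations(style_1)))
```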





[jira] [Updated] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Avinash Sridharan (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Avinash Sridharan updated MESOS-4823:
-
Description: 
Most docker and appc images wish to expose ports that micro-services are 
listening on, to the outside world. When containers are running on bridged (or 
ptp) networking this can be achieved by installing port forwarding rules on the 
agent (using iptables). This can be done in the `network/cni` isolator. 

The reason we would like this functionality to be implemented in the 
`network/cni` isolator, and not a CNI plugin, is that the specifications 
currently do not support specifying port forwarding rules. Further, to install 
these rules the isolator needs two pieces of information, the exposed ports and 
the IP address associated with the container. Both are available to the 
isolator.

  was:
Most docker and appc images wish ports that micro-services are listening on, to 
the outside world. When containers are running on bridged (or ptp) networking 
this can be achieved by installing port forwarding rules on the agent (using 
iptables). This can be done in the `network/cni` isolator. 

The reason we would like this functionality to be implemented in the 
`network/cni` isolator, and not a CNI plugin, is that the specifications 
currently do not support specifying port forwarding rules. Further, to install 
these rules the isolator needs two pieces of information, the exposed ports and 
the IP address associated with the container. Bother are available to the 
isolator.


> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are 
> listening on, to the outside world. When containers are running on bridged 
> (or ptp) networking this can be achieved by installing port forwarding rules 
> on the agent (using iptables). This can be done in the `network/cni` 
> isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.





[jira] [Updated] (MESOS-4744) mesos-execute should allow setting role

2016-03-19 Thread Jian Qiu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jian Qiu updated MESOS-4744:

Description: It will be quite useful if we can set role when running 
mesos-execute  (was: It will be quite useful if we can set role and command 
uris when running mesos-execute)

> mesos-execute should allow setting role
> ---
>
> Key: MESOS-4744
> URL: https://issues.apache.org/jira/browse/MESOS-4744
> Project: Mesos
>  Issue Type: Bug
>  Components: cli
>Reporter: Jian Qiu
>Assignee: Jian Qiu
>Priority: Minor
>
> It will be quite useful if we can set role when running mesos-execute





[jira] [Updated] (MESOS-4973) Duplicates in 'unregistered_frameworks' in /state

2016-03-19 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-4973:
--
Priority: Minor  (was: Major)

> Duplicates in 'unregistered_frameworks' in /state 
> --
>
> Key: MESOS-4973
> URL: https://issues.apache.org/jira/browse/MESOS-4973
> Project: Mesos
>  Issue Type: Bug
>Reporter: Yan Xu
>Priority: Minor
>
> In our clusters where many frameworks run, 'unregistered_frameworks' 
> currently doesn't show what it semantically means, but rather "a list of 
> frameworkIDs for each orphaned task", which means a lot of duplicated 
> frameworkIDs.
> For this field to be useful we need to deduplicate when outputting the list.





[jira] [Updated] (MESOS-4794) Add documentation around using the docker containerizer on CentOS 6.

2016-03-19 Thread Joseph Wu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joseph Wu updated MESOS-4794:
-
Sprint:   (was: Mesosphere Sprint 31)

> Add documentation around using the docker containerizer on CentOS 6.
> 
>
> Key: MESOS-4794
> URL: https://issues.apache.org/jira/browse/MESOS-4794
> Project: Mesos
>  Issue Type: Documentation
>  Components: docker, documentation
>Affects Versions: 0.28.0
>Reporter: Joseph Wu
>Assignee: Joseph Wu
>  Labels: containerizer, docker, documentation, mesosphere
>
> Support for persistent volumes was added to the docker containerizer in 
> [MESOS-3413].  However, this does not work on CentOS 6.
> On CentOS 6, the same {{docker run -v ...}} operation does not perform a 
> recursive bind, whereas on every other OS supported by Mesos, docker does a 
> recursive bind.
> Docker has already [dropped support for CentOS 
> 6|https://github.com/docker/docker/issues/14365], so we should add 
> precautionary documentation in case anyone tries to use the docker 
> containerizer on CentOS 6.





[jira] [Comment Edited] (MESOS-4977) Sometime Cmd":["-c","echo 'No such file or directory'] in task.

2016-03-19 Thread SERGEY GALKIN (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201767#comment-15201767
 ] 

SERGEY GALKIN edited comment on MESOS-4977 at 3/18/16 4:55 PM:
---

Mesos Slaves HW (189 nodes)

HP ProLiant DL380 Gen9,
CPU - 2 x Intel(R) Xeon(R) CPU E5-2680 v3 @2.50GHz (48 cores (with 
hyperthreading))
RAM - 264G,
Storage - 3.0T on RAID on HP Smart Array P840 Controller,
HDD - 12 x HP EH0600JDYTL
Network - 2 x Intel Corporation Ethernet 10G2P 
X710,



was (Author: sergeygals):
Mesos Slaves HW

HP ProLiant DL380 Gen9,
CPU - 2 x Intel(R) Xeon(R) CPU E5-2680 v3 @2.50GHz (48 cores (with 
hyperthreading))
RAM - 264G,
Storage - 3.0T on RAID on HP Smart Array P840 Controller,
HDD - 12 x HP EH0600JDYTL
Network - 2 x Intel Corporation Ethernet 10G2P 
X710,


> Sometime Cmd":["-c","echo 'No such file or directory'] in task.
> ---
>
> Key: MESOS-4977
> URL: https://issues.apache.org/jira/browse/MESOS-4977
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.2
> Environment: 189 mesos slaves on Ubuntu 14.04.3 LTS
>Reporter: SERGEY GALKIN
>
> mesos - 0.27.0
> marathon - 0.15.2
> I am trying to launch one simple nginx docker application with 500 
> instances on a cluster with 189 HW nodes through Marathon.
> {code}
> ID /1f532267a08494e3081c1acb42d273b7
> Command Unspecified
> Constraints Unspecified
> Dependencies Unspecified
> Labels Unspecified
> Resource Roles Unspecified
> Container
> {
>   "type": "DOCKER",
>   "volumes": [],
>   "docker": {
> "image": "nginx",
> "network": "BRIDGE",
> "portMappings": [
>   {
> "containerPort": 80,
> "hostPort": 0,
> "servicePort": 1,
> "protocol": "tcp"
>   }
> ],
> "privileged": false,
> "parameters": [],
> "forcePullImage": false
>   }
> }
> CPUs 1
> Environment Unspecified
> Executor Unspecified
> Health Checks 
> [
>   {
> "path": "/",
> "protocol": "HTTP",
> "portIndex": 0,
> "gracePeriodSeconds": 300,
> "intervalSeconds": 60,
> "timeoutSeconds": 20,
> "maxConsecutiveFailures": 3,
> "ignoreHttp1xx": false
>   }
> ]
> Instances 500
> IP Address Unspecified
> Memory 256 MiB
> Disk Space 50 MiB
> Ports 1
> Backoff Factor 1.15
> Backoff 1 seconds
> Max Launch Delay 3600 seconds
> URIs Unspecified
> User Unspecified
> {code}
> The deployment stalled in the Delayed state; only about 360-370 of the 500 
> instances started successfully. In the stdout of the failed mesos tasks I see 
> "No such file or directory".
> As I see in /var/log/upstart/docker.log with debug enabled, mesos sometimes 
> tries to start containers with a strange Cmd ("Cmd":["-c","echo 'No such file or 
> directory'; exit 1"]) and the task fails. Sometimes everything is OK: 
> "Cmd":null and the task is in the RUNNING state.
> Part of the log is available at http://paste.openstack.org/show/491122/
> I have successfully started 700 nginx docker applications with 10 instances 
> simultaneously in this cluster.





[jira] [Issue Comment Deleted] (MESOS-4976) Reject RESERVE on revocable resources

2016-03-19 Thread Klaus Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-4976:

Comment: was deleted

(was: Good point. I think it's not necessary.)

> Reject RESERVE on revocable resources
> -
>
> Key: MESOS-4976
> URL: https://issues.apache.org/jira/browse/MESOS-4976
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Klaus Ma
>
> In {{Resources::apply}}, we do not check whether the resources are revocable 
> or not. It does not make sense to reserve revocable resources.
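
The check being asked for could look roughly like the sketch below. It models resources as plain dictionaries with an optional "revocable" entry, loosely mirroring the optional revocable field on the Resource message; the function name and error text are hypothetical and this is not the code in {{Resources::apply}}:

```python
def validate_reserve(resources):
    """Return an error message if any resource is revocable, else None."""
    for resource in resources:
        # A resource counts as revocable when the optional field is set.
        if resource.get("revocable") is not None:
            return "Cannot reserve revocable resource '{}'".format(resource["name"])
    return None


# Hypothetical RESERVE payload: the second resource is revocable.
requested = [
    {"name": "cpus", "value": 1.0},
    {"name": "mem", "value": 128, "revocable": {}},
]
print(validate_reserve(requested))
```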





[jira] [Commented] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200265#comment-15200265
 ] 

Avinash Sridharan commented on MESOS-4823:
--

CNI, the way it's defined today, takes care of layer 3 addressing. So when you 
say getting in and out of the CNI network, I agree that the layer 3 forwarding 
primitives should be taken care of by the underlying plugin, and the isolator 
should be agnostic of these mechanisms. The problem here is that the 
service/container is addressable using a combination of layer 3 and layer 4 
(TCP/UDP port) addresses. The piece that CNI is missing, and that we are trying 
to address, is exposing the layer 4 address of the container to the outside 
world on an as-needed basis. 

> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are 
> listening on, to the outside world. When containers are running on bridged 
> (or ptp) networking this can be achieved by installing port forwarding rules 
> on the agent (using iptables). This can be done in the `network/cni` 
> isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.





[jira] [Commented] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198319#comment-15198319
 ] 

Avinash Sridharan commented on MESOS-4823:
--

But how would the CNI isolator know if the underlying plugin has the capability 
or not? Or for that matter the parameters that it needs to pass to the plugin?

> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are listening on, 
> to the outside world. When containers are running on bridged (or ptp) 
> networking this can be achieved by installing port forwarding rules on the 
> agent (using iptables). This can be done in the `network/cni` isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-4955) Generize perf event parsing to match PerfStatistics filed name for "perf stat"

2016-03-19 Thread Fan Du (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199148#comment-15199148
 ] 

Fan Du commented on MESOS-4955:
---

Really sweet, this is exactly what I need.
Thanks for the pointer.

> Generize perf event parsing to match PerfStatistics filed name for "perf stat"
> --
>
> Key: MESOS-4955
> URL: https://issues.apache.org/jira/browse/MESOS-4955
> Project: Mesos
>  Issue Type: Improvement
>  Components: isolation
>Reporter: Fan Du
>Assignee: Fan Du
>
> The current 
> [design|https://git-wip-us.apache.org/repos/asf?p=mesos.git;a=blob;f=include/mesos/mesos.proto;h=deb9c0910a27afd67276f54b3f666a878212727b;hb=HEAD#l981]
>  does not support events like:
> {{SUBSYS/EVENT  <- Most notable intel_cqm/llc_occupancy/}}
> {{SUBSYS:EVENT  <- All tracepoint events}}
> This gap could be closed by matching EVENT against the PerfStatistics 
> proto message field name.
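
One possible reading of the proposal, sketched below: strip the SUBSYS part from either event syntax and use the remaining EVENT as the name to look up. The helper name is hypothetical, `sched:sched_switch` is just an example tracepoint, and the sketch does not check whether a PerfStatistics field with that name actually exists:

```python
def event_field_name(event):
    """Map a perf event spec to its bare EVENT part."""
    if ":" in event:
        # Tracepoint syntax: SUBSYS:EVENT
        return event.split(":", 1)[1]
    if "/" in event:
        # PMU syntax: SUBSYS/EVENT/ (the trailing slash is optional here)
        return event.strip("/").split("/")[-1]
    # Plain events such as "cycles" are already bare names.
    return event


for spec in ["intel_cqm/llc_occupancy/", "sched:sched_switch", "cycles"]:
    print(spec, "->", event_field_name(spec))
```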





[jira] [Updated] (MESOS-4718) Add allocator metric for number of completed allocation runs

2016-03-19 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier updated MESOS-4718:

Shepherd: Benjamin Mahler

> Add allocator metric for number of completed allocation runs
> 
>
> Key: MESOS-4718
> URL: https://issues.apache.org/jira/browse/MESOS-4718
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation
>Reporter: Benjamin Bannier
>Assignee: Benjamin Bannier
>  Labels: mesosphere
>






[jira] [Commented] (MESOS-4969) improve overlayfs detection

2016-03-19 Thread Guangya Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200796#comment-15200796
 ] 

Guangya Liu commented on MESOS-4969:


I was trying to introduce {{overlay::supported}} in 
https://reviews.apache.org/r/44421/diff/3#index_header . I think this may 
be a good solution for this issue; what I need to change in my patch is to 
update the logic of {{overlay::supported}} to use {{modprobe -q overlay}} to 
check whether overlayfs is enabled. [~xujyan] [~haosd...@gmail.com] what do you 
think?

> improve overlayfs detection
> ---
>
> Key: MESOS-4969
> URL: https://issues.apache.org/jira/browse/MESOS-4969
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation, volumes
>Reporter: James Peach
>Priority: Minor
>
> On my Fedora 23, overlayfs is a module that is not loaded by default 
> (attempting to mount an overlayfs automatically triggers the module loading). 
> However {{mesos-slave}} won't start until I manually load the module since it 
> is not listed in {{/proc/filesystems}} until it is loaded.
> It would be nice if there was a more reliable way to determine overlayfs 
> support.
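
To make the failure mode concrete, here is a sketch of a check that only consults the contents of {{/proc/filesystems}}. The function name and the sample text are illustrative and this is not the Mesos detection code; it shows why a not-yet-loaded module looks the same as "unsupported":

```python
def filesystem_listed(proc_filesystems_text, name="overlay"):
    """Return True if `name` appears as a filesystem type in the given text."""
    for line in proc_filesystems_text.splitlines():
        fields = line.split()
        # Each line is an optional "nodev" flag followed by the type name,
        # so the filesystem type is always the last field.
        if fields and fields[-1] == name:
            return True
    return False


# Sample content before and after the overlay module is loaded.
before = "nodev\tsysfs\nnodev\tproc\n\text4\n"
after = before + "nodev\toverlay\n"
print(filesystem_listed(before))
print(filesystem_listed(after))
```

A more reliable check would need a fallback for the first case, for example trying to load the module (as with the `modprobe -q overlay` idea in the comment above) before concluding that overlayfs is unavailable.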





[jira] [Created] (MESOS-4960) Implement clang-tidy checks for correct usage of spawn/terminate

2016-03-19 Thread Benjamin Bannier (JIRA)
Benjamin Bannier created MESOS-4960:
---

 Summary: Implement clang-tidy checks for correct usage of 
spawn/terminate
 Key: MESOS-4960
 URL: https://issues.apache.org/jira/browse/MESOS-4960
 Project: Mesos
  Issue Type: Improvement
Reporter: Benjamin Bannier


The use of libprocess' {{spawn}} requires care, e.g.,

* if a process is {{spawn}}'ed as unmanaged it should always explicitly be 
{{terminate}}'ed at some later point,
* if a process is {{spawn}}'ed as managed the process must not be 
stack-allocated.

We should add clang static analysis to ensure correct usage.





[jira] [Commented] (MESOS-4963) Compile error with GCC 6

2016-03-19 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199892#comment-15199892
 ] 

Neil Conway commented on MESOS-4963:


Ah, good catch. What is going on here is that GCC6 defaults to {{-std=gnu++14}} 
(previous versions defaulted to {{-std=gnu++98}}). For some reason I haven't 
investigated, the code in question compiles with {{-std}} set to {{c++11}} and 
{{c++14}}, but not {{gnu++11}} or {{gnu++14}}. We likely don't want to use 
{{gnu++14}} anyway.

> Compile error with GCC 6
> 
>
> Key: MESOS-4963
> URL: https://issues.apache.org/jira/browse/MESOS-4963
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Neil Conway
>  Labels: mesosphere
>
> {noformat}
> $ head config.log
> [...]
> /mesos-2/configure --enable-optimize --disable-python CC=ccache 
> /home/vagrant/local/gcc/bin/gcc CXX=ccache /home/vagrant/local/gcc/bin/g++
> $ ~/local/gcc/bin/g++ --version
> g++ (GCC) 6.0.0 20160227 (experimental)
> Copyright (C) 2016 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> $ make V=0
> make[2]: Entering directory '/home/vagrant/build-mesos-2-gcc6/src'
>   CXX  appc/libmesos_no_3rdparty_la-spec.lo
> In file included from 
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/shell.hpp:22:0,
>  from 
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp:56,
>  from /mesos-2/src/appc/spec.cpp:17:
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp: 
> In instantiation of ‘int os::execlp(const char*, T ...) [with T = {const 
> char*, const char*, const char*, char*}]’:
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/fork.hpp:371:52:
>required from here
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp:151:18:
>  error: missing sentinel in function call [-Werror=format=]
>return ::execlp(file, t...);
>   ^~~~
> cc1plus: all warnings being treated as errors
> Makefile:5584: recipe for target 'appc/libmesos_no_3rdparty_la-spec.lo' failed
> {noformat}





[jira] [Commented] (MESOS-4810) ProvisionerDockerPullerTest.ROOT_INTERNET_CURL_ShellCommand fails.

2016-03-19 Thread Jie Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200164#comment-15200164
 ] 

Jie Yu commented on MESOS-4810:
---

Looks like the problem is that on some Linux distributions, '/bin' is not in 
$PATH when certain shells are used. Since the container image 'alpine' itself 
does not specify environment variables, $PATH will be inherited from the 
agent. As a result, when we exec, the exec cannot find 'sh' because it lives 
under /bin in alpine, but '/bin' is not in $PATH.
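
To make the diagnosis concrete, here is a small sketch of the condition being 
described: whether a given directory appears in a PATH-style string. The 
helper name {{pathContains}} is hypothetical and the comparison is 
deliberately literal (no normalization of trailing slashes or symlinks).

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: returns true if `dir` is one of the ':'-separated
// entries of a PATH-style string. Entries are compared literally, so
// "/bin/" and "/bin" are treated as different.
bool pathContains(const std::string& path, const std::string& dir)
{
  std::istringstream entries(path);
  std::string entry;
  while (std::getline(entries, entry, ':')) {
    if (entry == dir) {
      return true;
    }
  }
  return false;
}
```

With an inherited value such as {{/usr/local/bin:/usr/bin}}, the check for 
{{/bin}} fails, which matches the situation where {{sh}} cannot be found 
inside the alpine image.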

> ProvisionerDockerPullerTest.ROOT_INTERNET_CURL_ShellCommand fails.
> --
>
> Key: MESOS-4810
> URL: https://issues.apache.org/jira/browse/MESOS-4810
> Project: Mesos
>  Issue Type: Bug
>  Components: docker
>Affects Versions: 0.28.0
> Environment: CentOS 7 on AWS, both with or without SSL.
>Reporter: Bernd Mathiske
>Assignee: Jie Yu
>  Labels: docker, mesosphere, test
>
> {noformat}
> [09:46:46] :   [Step 11/11] [ RUN  ] 
> ProvisionerDockerRegistryPullerTest.ROOT_INTERNET_CURL_ShellCommand
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.628413  1166 leveldb.cpp:174] 
> Opened db in 4.242882ms
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.629926  1166 leveldb.cpp:181] 
> Compacted db in 1.483621ms
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.629966  1166 leveldb.cpp:196] 
> Created db iterator in 15498ns
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.629977  1166 leveldb.cpp:202] 
> Seeked to beginning of db in 1405ns
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.629984  1166 leveldb.cpp:271] 
> Iterated through 0 keys in the db in 239ns
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.630015  1166 replica.cpp:779] 
> Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.630470  1183 recover.cpp:447] 
> Starting replica recovery
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.630702  1180 recover.cpp:473] 
> Replica is in EMPTY status
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.631767  1182 replica.cpp:673] 
> Replica in EMPTY status received a broadcasted recover request from 
> (14567)@172.30.2.124:37431
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.632115  1183 recover.cpp:193] 
> Received a recover response from a replica in EMPTY status
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.632450  1186 recover.cpp:564] 
> Updating replica status to STARTING
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633476  1186 master.cpp:375] 
> Master 3fbb2fb0-4f18-498b-a440-9acbf6923a13 (ip-172-30-2-124.mesosphere.io) 
> started on 172.30.2.124:37431
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633491  1186 master.cpp:377] Flags 
> at startup: --acls="" --allocation_interval="1secs" 
> --allocator="HierarchicalDRF" --authenticate="true" 
> --authenticate_http="true" --authenticate_slaves="true" 
> --authenticators="crammd5" --authorizers="local" 
> --credentials="/tmp/4UxXoW/credentials" --framework_sorter="drf" 
> --help="false" --hostname_lookup="true" --http_authenticators="basic" 
> --initialize_driver_logging="true" --log_auto_initialize="true" 
> --logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
> --max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
> --quiet="false" --recovery_slave_removal_limit="100%" 
> --registry="replicated_log" --registry_fetch_timeout="1mins" 
> --registry_store_timeout="100secs" --registry_strict="true" 
> --root_submissions="true" --slave_ping_timeout="15secs" 
> --slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
> --webui_dir="/usr/local/share/mesos/webui" --work_dir="/tmp/4UxXoW/master" 
> --zk_session_timeout="10secs"
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633677  1186 master.cpp:422] 
> Master only allowing authenticated frameworks to register
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633685  1186 master.cpp:427] 
> Master only allowing authenticated slaves to register
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633692  1186 credentials.hpp:35] 
> Loading credentials for authentication from '/tmp/4UxXoW/credentials'
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633851  1183 leveldb.cpp:304] 
> Persisting metadata (8 bytes) to leveldb took 1.191043ms
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633873  1183 replica.cpp:320] 
> Persisted replica status to STARTING
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.633894  1186 master.cpp:467] Using 
> default 'crammd5' authenticator
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634003  1186 master.cpp:536] Using 
> default 'basic' HTTP authenticator
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634062  1184 recover.cpp:473] 
> Replica is in STARTING status
> [09:46:46]W:   [Step 11/11] I0229 09:46:46.634109  1186 master.cpp:570] 
> Authorization enabled
> [09:46:4

[jira] [Updated] (MESOS-4877) Mesos containerizer can't handle top level docker image like "alpine" (must use "library/alpine")

2016-03-19 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4877?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4877:
--
Fix Version/s: 0.28.1

> Mesos containerizer can't handle top level docker image like "alpine" (must 
> use "library/alpine")
> -
>
> Key: MESOS-4877
> URL: https://issues.apache.org/jira/browse/MESOS-4877
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker
>Affects Versions: 0.27.0, 0.27.1
>Reporter: Shuai Lin
>Assignee: Gilbert Song
> Fix For: 0.28.1
>
>
> This can be demonstrated with the {{mesos-execute}} command:
> # Docker containerizer with image {{alpine}}: success
> {code}
> sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=docker 
> --name=just-a-test --command="sleep 1000" --master=localhost:5050
> {code}
> # Mesos containerizer with image {{alpine}}: failure
> {code}
> sudo ./build/src/mesos-execute --docker_image=alpine --containerizer=mesos 
> --name=just-a-test --command="sleep 1000" --master=localhost:5050
> {code}
> # Mesos containerizer with image {{library/alpine}}: success
> {code}
> sudo ./build/src/mesos-execute --docker_image=library/alpine 
> --containerizer=mesos --name=just-a-test --command="sleep 1000" 
> --master=localhost:5050
> {code}
> In the slave logs:
> {code}
> ea-4460-83
> 9c-838da86af34c-0007'
> I0306 16:32:41.418269  3403 metadata_manager.cpp:159] Looking for image 
> 'alpine:latest'
> I0306 16:32:41.418699  3403 registry_puller.cpp:194] Pulling image 
> 'alpine:latest' from 
> 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to 
> '/tmp/mesos-test
> /store/docker/staging/ka7MlQ'
> E0306 16:32:43.098131  3400 slave.cpp:3773] Container 
> '4bf9132d-9a57-4baa-a78c-e7164e93ace6' for executor 'just-a-test' of 
> framework 4f055c6f-1bea-4460-839c-838da86af34c-0
> 007 failed to start: Collect failed: Unexpected HTTP response '401 
> Unauthorized
> {code}
> curl command executed:
> {code}
> $ sudo sysdig -A -p "*%evt.time %proc.cmdline" evt.type=execve and 
> proc.name=curl
>16:42:53.198998042 curl -s -S -L -D - 
> https://registry-1.docker.io:443/v2/alpine/manifests/latest
> 16:42:53.784958541 curl -s -S -L -D - 
> https://auth.docker.io/token?service=registry.docker.io&scope=repository:alpine:pull
> 16:42:54.294192024 curl -s -S -L -D - -H Authorization: Bearer 
> eyJhbGciOiJFUzI1NiIsInR5cCI6IkpXVCIsIng1YyI6WyJNSUlDTHpDQ0FkU2dBd0lCQWdJQkFEQUtCZ2dxaGtqT1BRUURBakJHTVVRd1FnWURWUVFERXp0Uk5Gb3pPa2RYTjBrNldGUlFSRHBJVFRSUk9rOVVWRmc2TmtGRlF6cFNUVE5ET2tGU01rTTZUMFkzTnpwQ1ZrVkJPa2xHUlVrNlExazFTekFlRncweE5UQTJNalV4T1RVMU5EWmFGdzB4TmpBMk1qUXhPVFUxTkRaYU1FWXhSREJDQmdOVkJBTVRPMGhHU1UwNldGZFZWam8yUVZkSU9sWlpUVEk2TTFnMVREcFNWREkxT2s5VFNrbzZTMVExUmpwWVRsSklPbFJMTmtnNlMxUkxOanBCUVV0VU1Ga3dFd1lIS29aSXpqMENBUVlJS29aSXpqMERBUWNEUWdBRXl2UzIvdEI3T3JlMkVxcGRDeFdtS1NqV1N2VmJ2TWUrWGVFTUNVMDByQjI0akNiUVhreFdmOSs0MUxQMlZNQ29BK0RMRkIwVjBGZGdwajlOWU5rL2pxT0JzakNCcnpBT0JnTlZIUThCQWY4RUJBTUNBSUF3RHdZRFZSMGxCQWd3QmdZRVZSMGxBREJFQmdOVkhRNEVQUVE3U0VaSlRUcFlWMVZXT2paQlYwZzZWbGxOTWpveldEVk1PbEpVTWpVNlQxTktTanBMVkRWR09saE9Va2c2VkVzMlNEcExWRXMyT2tGQlMxUXdSZ1lEVlIwakJEOHdQWUE3VVRSYU16cEhWemRKT2xoVVVFUTZTRTAwVVRwUFZGUllPalpCUlVNNlVrMHpRenBCVWpKRE9rOUdOemM2UWxaRlFUcEpSa1ZKT2tOWk5Vc3dDZ1lJS29aSXpqMEVBd0lEU1FBd1JnSWhBTXZiT2h4cHhrTktqSDRhMFBNS0lFdXRmTjZtRDFvMWs4ZEJOVGxuWVFudkFpRUF0YVJGSGJSR2o4ZlVSSzZ4UVJHRURvQm1ZZ3dZelR3Z3BMaGJBZzNOUmFvPSJdfQ.eyJhY2Nlc3MiOltdLCJhdWQiOiJyZWdpc3RyeS5kb2NrZXIuaW8iLCJleHAiOjE0NTcyODI4NzQsImlhdCI6MTQ1NzI4MjU3NCwiaXNzIjoiYXV0aC5kb2NrZXIuaW8iLCJqdGkiOiJaOGtyNXZXNEJMWkNIRS1IcVJIaCIsIm5iZiI6MTQ1NzI4MjU3NCwic3ViIjoiIn0.C2wtJq_P-m0buPARhmQjDfh6ztIAhcvgN3tfWIZEClSgXlVQ_sAQXAALNZKwAQL2Chj7NpHX--0GW-aeL_28Aw
>  https://registry-1.docker.io:443/v2/alpine/manifests/latest
> {code}
> Also got the same result with {{ubuntu}} docker image.
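
The three {{mesos-execute}} runs above suggest that a bare name such as 
{{alpine}} needs the {{library/}} namespace that Docker Hub uses for official 
images. A minimal sketch of that normalization follows; 
{{normalizeRepository}} is a hypothetical name, the rule is assumed to apply 
only to the default Docker Hub registry, and this is not necessarily how the 
fix was implemented in Mesos.

```cpp
#include <string>

// Hypothetical helper: prefixes a Docker Hub repository name with
// "library/" when it has no namespace component (no '/'), and leaves
// already-namespaced names untouched.
std::string normalizeRepository(const std::string& repository)
{
  if (repository.find('/') == std::string::npos) {
    return "library/" + repository;
  }
  return repository;
}
```

Under this sketch, {{alpine}} and {{library/alpine}} both resolve to 
{{library/alpine}}, so the manifest and token requests would use the same 
repository path in both cases.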





[jira] [Updated] (MESOS-4964) curl based docker fetcher fails to decode chunked encoding

2016-03-19 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4964:
--
Fix Version/s: (was: 0.29.0)
   0.28.1

> curl based docker fetcher fails to decode chunked encoding
> --
>
> Key: MESOS-4964
> URL: https://issues.apache.org/jira/browse/MESOS-4964
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Affects Versions: 0.28.0
>Reporter: James Peach
>Assignee: Jie Yu
>  Labels: mesosphere
> Fix For: 0.28.1
>
>
> If the curl-based fetcher gets an HTTP response that is chunked, the HTTP 
> decode fails because the response headers say it is chunked, but curl has 
> already dechunked the body it writes to stdout.
> {code}
> E0316 15:23:31.124482 13299 slave.cpp:3773] Container 
> 'fa06a5ee-637e-480c-b602-59705b707d85' for executor 'jpeach.10489' of 
> framework 96d1191b-cdf0-40f6-8840-e4d4d92a9345-0010 failed to start: Collect 
> failed: Failed to decode HTTP responses: Decoding failed
> HTTP/1.1 400 Bad Request
> Server: nginx/1.9.4
> Date: Wed, 16 Mar 2016 22:23:30 GMT
> Content-Type: application/json
> Transfer-Encoding: chunked
> Connection: keep-alive
> X-Artifactory-Id: ae6c9bffd47ec19a:-61ef0a68:1537a605a05:-8000
> {
>   "errors" : [ {
> "status" : 400,
> "message" : "Unsupported docker v2 repository request for 
> 'docker-registry'"
>   } ]
> }
> {code}





[jira] [Updated] (MESOS-4697) Consolidate cgroup isolators into one single isolator.

2016-03-19 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-4697:

Description: 
There are two motivations for this:
1) It's very verbose to add a new isolator. For cgroup isolators (e.g., cpu, 
mem, net_cls, etc.), much of the logic is the same. We are currently 
duplicating a lot of the code.
2) Initially, we decided to use a separate isolator for each cgroup subsystem 
because we wanted each subsystem to be mounted under a different hierarchy. 
This has gradually stopped being true with the unified cgroup hierarchy 
introduced in kernel 3.16 ([The unified control group hierarchy in 
3.16|https://lwn.net/Articles/601840/], 
[cgroup-v2|https://github.com/torvalds/linux/blob/master/Documentation/cgroup-v2.txt]).
Also, on some popular Linux distributions, some subsystems are co-mounted 
within the same hierarchy (e.g., net_cls and net_prio, cpu and cpuacct). It 
becomes very hard for two isolators to co-manage a hierarchy.

We can still introduce subsystem-specific code under the unified cgroup 
isolator by introducing a Subsystem abstraction.

  was:
Linux has offered the unified cgroup hierarchy since 3.16: [The unified control 
group hierarchy in 3.16|https://lwn.net/Articles/601840/], 
[cgroup-v2|https://github.com/torvalds/linux/blob/master/Documentation/cgroup-v2.txt]

There are two motivations for this:
1) It's very verbose to add a new isolator. For cgroup isolators (e.g., cpu, 
mem, net_cls, etc.), much of the logic is the same. We are currently 
duplicating a lot of the code.
2) Initially, we decided to use a separate isolator for each cgroup subsystem 
because we wanted each subsystem to be mounted under a different hierarchy. 
This has gradually stopped being true with the unified cgroup hierarchy 
introduced in kernel 3.16. Also, on some popular Linux distributions, some 
subsystems are co-mounted within the same hierarchy (e.g., net_cls and 
net_prio, cpu and cpuacct). It becomes very hard for two isolators to 
co-manage a hierarchy.

We can still introduce subsystem-specific code under the unified cgroup 
isolator (e.g., introduce a Subsystem abstraction?).


> Consolidate cgroup isolators into one single isolator.
> --
>
> Key: MESOS-4697
> URL: https://issues.apache.org/jira/browse/MESOS-4697
> Project: Mesos
>  Issue Type: Epic
>Reporter: Jie Yu
>Assignee: haosdent
> Attachments: cgroup_v2.pdf
>
>
> There are two motivations for this:
> 1) It's very verbose to add a new isolator. For cgroup isolators (e.g., cpu, 
> mem, net_cls, etc.), much of the logic is the same. We are currently 
> duplicating a lot of the code.
> 2) Initially, we decided to use a separate isolator for each cgroup subsystem 
> because we wanted each subsystem to be mounted under a different hierarchy. 
> This has gradually stopped being true with the unified cgroup hierarchy 
> introduced in kernel 3.16 ([The unified control group hierarchy in 
> 3.16|https://lwn.net/Articles/601840/], 
> [cgroup-v2|https://github.com/torvalds/linux/blob/master/Documentation/cgroup-v2.txt]).
> Also, on some popular Linux distributions, some subsystems are co-mounted 
> within the same hierarchy (e.g., net_cls and net_prio, cpu and cpuacct). It 
> becomes very hard for two isolators to co-manage a hierarchy.
> We can still introduce subsystem-specific code under the unified cgroup 
> isolator by introducing a Subsystem abstraction.





[jira] [Updated] (MESOS-4954) URI fetcher error message if plugin is not found is mis-leading.

2016-03-19 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4954:
--
Shepherd: Jie Yu

> URI fetcher error message if plugin is not found is mis-leading.
> 
>
> Key: MESOS-4954
> URL: https://issues.apache.org/jira/browse/MESOS-4954
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Reporter: Anand Mazumdar
>Assignee: Yong Tang
>  Labels: newbie
> Fix For: 0.29.0
>
>
> In {{src/uri/fetcher.cpp}}, if we are unable to create a plugin, we skip it 
> but we log a misleading error message:
> {code}
>   // NOTE: We skip the plugin if it cannot be created, instead of
>   // returning an Error so that we can still use other plugins.
>   LOG(ERROR) << "Failed to create URI fetcher plugin "
>  << "'"  << name << "': " << plugin.error();
> {code}
> Ideally, it should be at most a {{LOG(INFO)}} that clearly states that 
> the relevant plugin was skipped because it was not found.





[jira] [Comment Edited] (MESOS-4949) Executor shutdown grace period should be configurable.

2016-03-19 Thread Alexander Rukletsov (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15195382#comment-15195382
 ] 

Alexander Rukletsov edited comment on MESOS-4949 at 3/17/16 11:48 PM:
--

https://reviews.apache.org/r/44655/
https://reviews.apache.org/r/44854/
https://reviews.apache.org/r/44994/


was (Author: alexr):
https://reviews.apache.org/r/44655/
https://reviews.apache.org/r/44854/

> Executor shutdown grace period should be configurable.
> --
>
> Key: MESOS-4949
> URL: https://issues.apache.org/jira/browse/MESOS-4949
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: mesosphere
>
> Currently, executor shutdown grace period is specified by an agent flag, 
> which is propagated to executors via the 
> {{MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD}} environment variable. There is no 
> way to adjust this timeout for the needs of a particular executor.
> To tackle this problem, we propose to introduce an optional 
> {{shutdown_grace_period}} field in {{ExecutorInfo}}.





[jira] [Created] (MESOS-4982) Create a long running HTTP based framework

2016-03-19 Thread Anand Mazumdar (JIRA)
Anand Mazumdar created MESOS-4982:
-

 Summary: Create a long running HTTP based framework
 Key: MESOS-4982
 URL: https://issues.apache.org/jira/browse/MESOS-4982
 Project: Mesos
  Issue Type: Task
Reporter: Anand Mazumdar


We need a long running test framework similar to 
{{src/examples/long_lived_framework.cpp}} that uses the v1 Scheduler API.

This would allow us to vet the v1 API and the scheduler library in test 
clusters.





[jira] [Commented] (MESOS-4610) MasterContender/MasterDetector should be loadable as modules

2016-03-19 Thread ANURAG SINGH (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197665#comment-15197665
 ] 

ANURAG SINGH commented on MESOS-4610:
-

We've proceeded further with the review and here's the updated list of changes:

https://reviews.apache.org/r/44287/
https://reviews.apache.org/r/44288/
https://reviews.apache.org/r/44543/
https://reviews.apache.org/r/44544/
https://reviews.apache.org/r/44545/
https://reviews.apache.org/r/44546/
https://reviews.apache.org/r/44547/
https://reviews.apache.org/r/44289/
https://reviews.apache.org/r/44669/
https://reviews.apache.org/r/44670/

> MasterContender/MasterDetector should be loadable as modules
> 
>
> Key: MESOS-4610
> URL: https://issues.apache.org/jira/browse/MESOS-4610
> Project: Mesos
>  Issue Type: Improvement
>  Components: master
>Reporter: Mark Cavage
>Assignee: Mark Cavage
>
> Currently Mesos depends on ZooKeeper for leader election and notification to 
> slaves, although there is a C++ hierarchy in the code to support alternatives 
> (e.g., unit tests use an in-memory implementation). From an operational 
> perspective, many organizations/users do not want to take a dependency on 
> ZooKeeper and would rather use an alternative solution for leader election. 
> Our organization in particular very much wants this, and for reference 
> there have been several requests from the community (see referenced tickets) 
> to replace it with etcd/consul/etc.
> This ticket will serve as the work effort to modularize the 
> MasterContender/MasterDetector APIs such that integrators can build a 
> pluggable solution of their choice; this ticket will not fold in any 
> implementations such as etcd et al., but simply make this hierarchy 
> fully pluggable.





[jira] [Comment Edited] (MESOS-4969) improve overlayfs detection

2016-03-19 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200046#comment-15200046
 ] 

haosdent edited comment on MESOS-4969 at 3/17/16 5:58 PM:
--

How about using {{lsmod}} or {{/proc/modules}} to detect this? CentOS 7 
behaves the same way.


was (Author: haosd...@gmail.com):
How about using {{lsmod}} to detect this? CentOS 7 behaves the same way.

> improve overlayfs detection
> ---
>
> Key: MESOS-4969
> URL: https://issues.apache.org/jira/browse/MESOS-4969
> Project: Mesos
>  Issue Type: Bug
>  Components: isolation, volumes
>Reporter: James Peach
>Priority: Minor
>
> On my Fedora 23, overlayfs is a module that is not loaded by default 
> (attempting to mount an overlayfs automatically triggers the module loading). 
> However {{mesos-slave}} won't start until I manually load the module since it 
> is not listed in {{/proc/filesystems}} until it is loaded.
> It would be nice if there was a more reliable way to determine overlayfs 
> support.





[jira] [Assigned] (MESOS-4070) numify() handles negative numbers inconsistently.

2016-03-19 Thread Yong Tang (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yong Tang reassigned MESOS-4070:


Assignee: Yong Tang

> numify() handles negative numbers inconsistently.
> -
>
> Key: MESOS-4070
> URL: https://issues.apache.org/jira/browse/MESOS-4070
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Jie Yu
>Assignee: Yong Tang
>  Labels: tech-debt
>
> As pointed out by [~neilc] in this review:
> https://reviews.apache.org/r/40988
> {noformat}
> Try<int> num2 = numify<int>("-10");
> EXPECT_SOME_EQ(-10, num2);
> // TODO(neilc): This is inconsistent with the handling of non-hex numbers.
> EXPECT_ERROR(numify<int>("-0x10"));
> {noformat}
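
For illustration, one way to make the two cases consistent is to strip an 
optional leading minus sign first and then parse the remainder as either 
decimal or {{0x}}-prefixed hex. The sketch below is a standalone example, not 
stout's {{numify}}: the name {{parseSigned}} and its out-parameter interface 
are assumptions made so the example compiles on its own.

```cpp
#include <climits>
#include <cstddef>
#include <string>

// Hypothetical stand-in for the behaviour discussed above (not stout's
// numify): parses an optionally negative decimal or "0x"-prefixed hex
// integer. Returns false on malformed input or overflow. For simplicity
// the most negative long long value is not representable here.
bool parseSigned(const std::string& s, long long* out)
{
  std::size_t pos = 0;
  bool negative = false;

  if (pos < s.size() && s[pos] == '-') {
    negative = true;
    ++pos;
  }

  int base = 10;
  if (s.size() >= pos + 2 && s[pos] == '0' &&
      (s[pos + 1] == 'x' || s[pos + 1] == 'X')) {
    base = 16;
    pos += 2;
  }

  // Reject inputs with no digits at all, such as "", "-" or "0x".
  if (pos == s.size()) {
    return false;
  }

  long long value = 0;
  for (; pos < s.size(); ++pos) {
    const char c = s[pos];
    int digit = 0;
    if (c >= '0' && c <= '9') {
      digit = c - '0';
    } else if (base == 16 && c >= 'a' && c <= 'f') {
      digit = c - 'a' + 10;
    } else if (base == 16 && c >= 'A' && c <= 'F') {
      digit = c - 'A' + 10;
    } else {
      return false;
    }

    // Guard against overflow before accumulating the next digit.
    if (value > (LLONG_MAX - digit) / base) {
      return false;
    }
    value = value * base + digit;
  }

  *out = negative ? -value : value;
  return true;
}
```

Handling the sign before choosing the base means {{-10}} and {{-0x10}} go 
through the same path, so both succeed; whether negative hex should be 
accepted or rejected is a design decision for the ticket, and the sketch only 
shows the "accept both" option.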





[jira] [Commented] (MESOS-2154) Port CFS quota support to Docker Containerizer

2016-03-19 Thread Steve Niemitz (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197523#comment-15197523
 ] 

Steve Niemitz commented on MESOS-2154:
--

Over the life of an executor, tasks can be started and stopped.  When that 
happens the resource quota of the (docker) container the executor is running in 
changes and needs to be updated.

The same thing happens with (and is handled by) normal mesos cgroup containers 
as well.

> Port CFS quota support to Docker Containerizer
> --
>
> Key: MESOS-2154
> URL: https://issues.apache.org/jira/browse/MESOS-2154
> Project: Mesos
>  Issue Type: Improvement
>  Components: docker, isolation
>Affects Versions: 0.21.0
> Environment: Linux (Ubuntu 14.04.1)
>Reporter: Andrew Ortman
>Assignee: haosdent
>Priority: Minor
>
> Port the CFS quota support the Mesos Containerizer has to the Docker 
> Containerizer. Whenever the --cgroup_enable_cfs flag is set, the Docker 
> Containerizer should update the cfs_period_us and cfs_quota_us values to 
> allow hard CPU capping on the container. 
> Current workaround is to pass those values as LXC configuration parameters
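
As a rough illustration of the arithmetic behind hard CPU capping: the quota 
written to {{cfs_quota_us}} is the CPU allocation multiplied by the period 
written to {{cfs_period_us}}. The sketch below assumes a 100 ms period and a 
1 ms minimum quota; both constants and the function name {{cfsQuotaUs}} are 
assumptions for this example rather than values taken from the ticket.

```cpp
#include <algorithm>
#include <cstdint>

// Assumed constants for illustration, both in microseconds: a 100 ms CFS
// period and a 1 ms lower bound on the quota.
constexpr std::int64_t kCfsPeriodUs = 100000;
constexpr std::int64_t kMinCfsQuotaUs = 1000;

// Computes the value to write to cpu.cfs_quota_us for a container that is
// allowed `cpus` CPUs, clamped to the assumed minimum quota.
std::int64_t cfsQuotaUs(double cpus)
{
  const std::int64_t quota =
    static_cast<std::int64_t>(cpus * kCfsPeriodUs);
  return std::max(quota, kMinCfsQuotaUs);
}
```

For example, 0.5 CPUs yields a quota of 50000 us per 100000 us period, and 
2 CPUs yields 200000 us. Since the allocation of a running executor changes 
as tasks start and stop, the value would need to be recomputed on each 
resource update.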





[jira] [Created] (MESOS-4961) ContainerLoggerTest.LOGROTATE_RotateInSandbox is flaky

2016-03-19 Thread Joseph Wu (JIRA)
Joseph Wu created MESOS-4961:


 Summary: ContainerLoggerTest.LOGROTATE_RotateInSandbox is flaky
 Key: MESOS-4961
 URL: https://issues.apache.org/jira/browse/MESOS-4961
 Project: Mesos
  Issue Type: Bug
 Environment: Seen on ASF CI (Ubuntu 14 + GCC)
Reporter: Joseph Wu


The logger subprocesses may exit before we reach the {{waitpid}} in the test.  
If this happens, {{waitpid}} will return a {{-1}} as the process no longer 
exists.
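
The race described above can be sketched as a wait helper that treats "the 
process is already gone" as success. This is only an illustration of the 
idea, assuming the {{-1}} is accompanied by {{ECHILD}}; the name 
{{waitTolerant}} is hypothetical and this is not the change made to the test.

```cpp
#include <cerrno>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: waits for `pid` and returns true if it was reaped
// here or had already gone away (waitpid fails with ECHILD). Retries when
// the call is interrupted by a signal.
bool waitTolerant(pid_t pid)
{
  int status = 0;
  pid_t result;

  do {
    result = ::waitpid(pid, &status, 0);
  } while (result == -1 && errno == EINTR);

  if (result == pid) {
    return true;
  }

  // The process no longer exists as a waitable child, so there is
  // nothing left to wait for.
  return result == -1 && errno == ECHILD;
}
```

Calling the helper twice on the same child shows both paths: the first call 
reaps the process and the second gets {{ECHILD}}, yet both report success.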

Verbose logs:
{code}
[ RUN  ] ContainerLoggerTest.LOGROTATE_RotateInSandbox
I0316 14:28:51.329337  1242 cluster.cpp:139] Creating default 'local' authorizer
I0316 14:28:51.332823  1242 leveldb.cpp:174] Opened db in 3.079559ms
I0316 14:28:51.333916  1242 leveldb.cpp:181] Compacted db in 1.054247ms
I0316 14:28:51.333979  1242 leveldb.cpp:196] Created db iterator in 21450ns
I0316 14:28:51.334005  1242 leveldb.cpp:202] Seeked to beginning of db in 2205ns
I0316 14:28:51.334025  1242 leveldb.cpp:271] Iterated through 0 keys in the db 
in 410ns
I0316 14:28:51.334089  1242 replica.cpp:779] Replica recovered with log 
positions 0 -> 0 with 1 holes and 0 unlearned
I0316 14:28:51.334661  1275 recover.cpp:447] Starting replica recovery
I0316 14:28:51.335044  1275 recover.cpp:473] Replica is in EMPTY status
I0316 14:28:51.336207  1262 replica.cpp:673] Replica in EMPTY status received a 
broadcasted recover request from (484)@172.17.0.3:45919
I0316 14:28:51.336730  1270 recover.cpp:193] Received a recover response from a 
replica in EMPTY status
I0316 14:28:51.337257  1275 recover.cpp:564] Updating replica status to STARTING
I0316 14:28:51.338001  1267 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 537200ns
I0316 14:28:51.338032  1267 replica.cpp:320] Persisted replica status to 
STARTING
I0316 14:28:51.338183  1261 master.cpp:376] Master 
c7653f60-33e9-4406-9f62-dc74c906bf83 (2cbb23302fe5) started on 172.17.0.3:45919
I0316 14:28:51.338295  1263 recover.cpp:473] Replica is in STARTING status
I0316 14:28:51.338213  1261 master.cpp:378] Flags at startup: --acls="" 
--allocation_interval="1secs" --allocator="HierarchicalDRF" 
--authenticate="true" --authenticate_http="true" --authenticate_slaves="true" 
--authenticators="crammd5" --authorizers="local" 
--credentials="/tmp/XtqwkS/credentials" --framework_sorter="drf" --help="false" 
--hostname_lookup="true" --http_authenticators="basic" 
--initialize_driver_logging="true" --log_auto_initialize="true" 
--logbufsecs="0" --logging_level="INFO" --max_completed_frameworks="50" 
--max_completed_tasks_per_framework="1000" --max_slave_ping_timeouts="5" 
--quiet="false" --recovery_slave_removal_limit="100%" 
--registry="replicated_log" --registry_fetch_timeout="1mins" 
--registry_store_timeout="100secs" --registry_strict="true" 
--root_submissions="true" --slave_ping_timeout="15secs" 
--slave_reregister_timeout="10mins" --user_sorter="drf" --version="false" 
--webui_dir="/mesos/mesos-0.29.0/_inst/share/mesos/webui" 
--work_dir="/tmp/XtqwkS/master" --zk_session_timeout="10secs"
I0316 14:28:51.338562  1261 master.cpp:423] Master only allowing authenticated 
frameworks to register
I0316 14:28:51.338572  1261 master.cpp:428] Master only allowing authenticated 
slaves to register
I0316 14:28:51.338580  1261 credentials.hpp:35] Loading credentials for 
authentication from '/tmp/XtqwkS/credentials'
I0316 14:28:51.338877  1261 master.cpp:468] Using default 'crammd5' 
authenticator
I0316 14:28:51.339030  1262 replica.cpp:673] Replica in STARTING status 
received a broadcasted recover request from (485)@172.17.0.3:45919
I0316 14:28:51.339246  1261 master.cpp:537] Using default 'basic' HTTP 
authenticator
I0316 14:28:51.339393  1261 master.cpp:571] Authorization enabled
I0316 14:28:51.339390  1266 recover.cpp:193] Received a recover response from a 
replica in STARTING status
I0316 14:28:51.339606  1271 whitelist_watcher.cpp:77] No whitelist given
I0316 14:28:51.339607  1275 hierarchical.cpp:144] Initialized hierarchical 
allocator process
I0316 14:28:51.340077  1268 recover.cpp:564] Updating replica status to VOTING
I0316 14:28:51.340533  1270 leveldb.cpp:304] Persisting metadata (8 bytes) to 
leveldb took 331558ns
I0316 14:28:51.340558  1270 replica.cpp:320] Persisted replica status to VOTING
I0316 14:28:51.340672  1270 recover.cpp:578] Successfully joined the Paxos group
I0316 14:28:51.340827  1270 recover.cpp:462] Recover process terminated
I0316 14:28:51.341684  1270 master.cpp:1806] The newly elected leader is 
master@172.17.0.3:45919 with id c7653f60-33e9-4406-9f62-dc74c906bf83
I0316 14:28:51.341717  1270 master.cpp:1819] Elected as the leading master!
I0316 14:28:51.341740  1270 master.cpp:1508] Recovering from registrar
I0316 14:28:51.341954  1263 registrar.cpp:307] Recovering registrar
I0316 14:28:51.342499  1273 log.cpp:659] Attempting to start the writer
I0316 14:28:51.343616  1266 replica.cpp:493] Replica received imp

[jira] [Updated] (MESOS-4963) Compile error with GCC 6

2016-03-19 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-4963:
---
Description: 
{noformat}
$ head config.log
[...]
/mesos-2/configure --enable-optimize --disable-python CC=ccache 
/home/vagrant/local/gcc/bin/gcc CXX=ccache /home/vagrant/local/gcc/bin/g++
$ ~/local/gcc/bin/g++ --version
g++ (GCC) 6.0.0 20160227 (experimental)
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ make V=0
make[2]: Entering directory '/home/vagrant/build-mesos-2-gcc6/src'
  CXX  appc/libmesos_no_3rdparty_la-spec.lo
In file included from 
/mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/shell.hpp:22:0,
 from 
/mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp:56,
 from /mesos-2/src/appc/spec.cpp:17:
/mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp: 
In instantiation of ‘int os::execlp(const char*, T ...) [with T = {const char*, 
const char*, const char*, char*}]’:
/mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/fork.hpp:371:52:
   required from here
/mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp:151:18:
 error: missing sentinel in function call [-Werror=format=]
   return ::execlp(file, t...);
  ^~~~
cc1plus: all warnings being treated as errors
Makefile:5584: recipe for target 'appc/libmesos_no_3rdparty_la-spec.lo' failed
{noformat}

I'll verify this with a more recent GCC6 snapshot, but assuming it repros, I 
think we have a few options:

* Have {{os::execlp}} *not* specify a NULL sentinel, and instead have the 
implementation of {{os::execlp}} always pass {{static_cast<char*>(NULL)}} as the 
last argument to the {{execlp}} call.
* Disable the GCC warning via a pragma or similar means.
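
A minimal sketch of the first option, assuming a POSIX system. The name `execlpWithSentinel` is hypothetical and this is not the stout implementation. It only shows the wrapper appending the null terminator itself, so that the sentinel is visible to the compiler at the call to `::execlp`:

```cpp
#include <cassert>
#include <unistd.h>

// Hypothetical wrapper (not the stout code): callers pass only the real
// arguments and never a trailing NULL. The wrapper appends the typed null
// pointer that execlp() requires as its last argument.
template <typename... T>
int execlpWithSentinel(const char* file, T... t)
{
  assert(file != nullptr);

  // On success execlp() does not return; on failure it returns -1.
  return ::execlp(file, t..., static_cast<char*>(nullptr));
}
```

A call site would then read `execlpWithSentinel("sh", "sh", "-c", command)`, with no sentinel to forget.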

  was:
{noformat}
$ ~/local/gcc/bin/g++ --version
g++ (GCC) 6.0.0 20160227 (experimental)
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
$ make V=0
make[2]: Entering directory '/home/vagrant/build-mesos-2-gcc6/src'
  CXX  appc/libmesos_no_3rdparty_la-spec.lo
In file included from 
/mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/shell.hpp:22:0,
 from 
/mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp:56,
 from /mesos-2/src/appc/spec.cpp:17:
/mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp: 
In instantiation of ‘int os::execlp(const char*, T ...) [with T = {const char*, 
const char*, const char*, char*}]’:
/mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/fork.hpp:371:52:
   required from here
/mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp:151:18:
 error: missing sentinel in function call [-Werror=format=]
   return ::execlp(file, t...);
  ^~~~
cc1plus: all warnings being treated as errors
Makefile:5584: recipe for target 'appc/libmesos_no_3rdparty_la-spec.lo' failed
{noformat}

I'll verify this with a more recent GCC6 snapshot, but assuming it repros, I 
think we have a few options:

* Have {{os::execlp}} *not* specify a NULL sentinel, and instead have the 
implementation of {{os::execlp}} always pass {{static_cast<char*>(NULL)}} as the 
last argument to the {{execlp}} call.
* Disable the GCC warning via a pragma or similar means.


> Compile error with GCC 6
> 
>
> Key: MESOS-4963
> URL: https://issues.apache.org/jira/browse/MESOS-4963
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Neil Conway
>  Labels: mesosphere
>
> {noformat}
> $ head config.log
> [...]
> /mesos-2/configure --enable-optimize --disable-python CC=ccache 
> /home/vagrant/local/gcc/bin/gcc CXX=ccache /home/vagrant/local/gcc/bin/g++
> $ ~/local/gcc/bin/g++ --version
> g++ (GCC) 6.0.0 20160227 (experimental)
> Copyright (C) 2016 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> $ make V=0
> make[2]: Entering directory '/home/vagrant/build-mesos-2-gcc6/src'
>   CXX  appc/libmesos_no_3rdparty_la-spec.lo
> In file included from 
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/shell.hpp:22:0,
>  from 
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp:56,
>  from /mesos-2/src/appc/spec.cpp:17:
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp: 
> In instantiation of ‘int os::execlp(const char*, 

[jira] [Comment Edited] (MESOS-4033) Add a commit hook for non-ascii charachters

2016-03-19 Thread Yong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201573#comment-15201573
 ] 

Yong Tang edited comment on MESOS-4033 at 3/18/16 2:50 PM:
---

Hi [~alexr] [~bernd-mesos], I added a review request to address this issue:
https://reviews.apache.org/r/45033/

Would appreciate any feedback or comments.


was (Author: yongtang):
Hi [~alexr] [~bernd-mesos], I added a review request to address this issue:
https://reviews.apache.org/r/45033/

Would appreciate for any feedbacks or comments.

> Add a commit hook for non-ascii charachters
> ---
>
> Key: MESOS-4033
> URL: https://issues.apache.org/jira/browse/MESOS-4033
> Project: Mesos
>  Issue Type: Task
>Reporter: Alexander Rukletsov
>Assignee: Yong Tang
>Priority: Minor
>  Labels: mesosphere
>
> Non-ascii characters invisible in some editors may sneak into the codebase 
> (see e.g. https://reviews.apache.org/r/40799/). To avoid this, a pre-commit 
> hook can be added.
> Quick searching suggested a simple perl script: 
> https://superuser.com/questions/417305/how-can-i-identify-non-ascii-characters-from-the-shell
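
The ticket points at a perl script. As a rough illustration of the core check such a hook needs, here is a C++ sketch. The function name `findNonAscii` is hypothetical, and a real hook would still have to run it over the staged files and reject the commit when anything is found:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Returns the zero-based byte offsets in `text` whose value lies outside the
// 7-bit ASCII range (0x00-0x7F). An empty result means the text is pure ASCII.
std::vector<std::size_t> findNonAscii(const std::string& text)
{
  std::vector<std::size_t> offsets;

  for (std::size_t i = 0; i < text.size(); ++i) {
    // Cast first: plain `char` may be signed, in which case bytes >= 0x80
    // would otherwise compare as negative values.
    if (static_cast<unsigned char>(text[i]) > 0x7F) {
      offsets.push_back(i);
    }
  }

  return offsets;
}
```

Reporting byte offsets rather than a yes/no answer lets the hook show where an invisible character crept in.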





[jira] [Assigned] (MESOS-4963) Compile error with GCC 6

2016-03-19 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway reassigned MESOS-4963:
--

Assignee: Neil Conway

> Compile error with GCC 6
> 
>
> Key: MESOS-4963
> URL: https://issues.apache.org/jira/browse/MESOS-4963
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> {noformat}
> $ head config.log
> [...]
> /mesos-2/configure --enable-optimize --disable-python CC=ccache 
> /home/vagrant/local/gcc/bin/gcc CXX=ccache /home/vagrant/local/gcc/bin/g++
> $ ~/local/gcc/bin/g++ --version
> g++ (GCC) 6.0.0 20160227 (experimental)
> Copyright (C) 2016 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> $ make V=0
> make[2]: Entering directory '/home/vagrant/build-mesos-2-gcc6/src'
>   CXX  appc/libmesos_no_3rdparty_la-spec.lo
> In file included from 
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/shell.hpp:22:0,
>  from 
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp:56,
>  from /mesos-2/src/appc/spec.cpp:17:
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp: 
> In instantiation of ‘int os::execlp(const char*, T ...) [with T = {const 
> char*, const char*, const char*, char*}]’:
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/fork.hpp:371:52:
>required from here
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp:151:18:
>  error: missing sentinel in function call [-Werror=format=]
>return ::execlp(file, t...);
>   ^~~~
> cc1plus: all warnings being treated as errors
> Makefile:5584: recipe for target 'appc/libmesos_no_3rdparty_la-spec.lo' failed
> {noformat}





[jira] [Updated] (MESOS-4986) Remove full used agent after stage 1

2016-03-19 Thread Klaus Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-4986:

Summary: Remove full used agent after stage 1  (was: Remove full used agent 
in stage 2)

> Remove full used agent after stage 1
> 
>
> Key: MESOS-4986
> URL: https://issues.apache.org/jira/browse/MESOS-4986
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Klaus Ma
>
> If the resources in an Agent are not {{allocatable}}, it's not necessary to 
> handle it in stage 2 for each framework. It will improve the performance by 
> avoiding an unnecessary loop.





[jira] [Updated] (MESOS-4986) Remove full used agent after stage 1

2016-03-19 Thread Klaus Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-4986:

Description: If the resources in an Agent are not {{allocatable}} after 
stage 1, it's not necessary to handle it in stage 2 for each framework. It will 
improve the performance by avoiding an unnecessary loop.  (was: If the resources 
in an Agent are not {{allocatable}}, it's not necessary to handle it in stage 2 
for each framework. It will improve the performance by avoiding an unnecessary 
loop.)

> Remove full used agent after stage 1
> 
>
> Key: MESOS-4986
> URL: https://issues.apache.org/jira/browse/MESOS-4986
> Project: Mesos
>  Issue Type: Bug
>  Components: allocation
>Reporter: Klaus Ma
>
> If the resources in an Agent are not {{allocatable}} after stage 1, it's not 
> necessary to handle it in stage 2 for each framework. It will improve the 
> performance by avoiding an unnecessary loop.
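
The idea can be sketched as a filter that runs once between the two stages. Everything below is a simplified assumption for illustration: the `Agent` struct, the threshold constants and the `stillAllocatable` helper are hypothetical and are not the allocator's actual types:

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified view of what is left on an agent after stage 1.
struct Agent
{
  std::string id;
  double cpus;
  double memMB;
};

// Assumed thresholds below which leftovers are not worth offering.
constexpr double MIN_CPUS = 0.01;
constexpr double MIN_MEM_MB = 32.0;

bool allocatable(const Agent& agent)
{
  return agent.cpus >= MIN_CPUS || agent.memMB >= MIN_MEM_MB;
}

// Run once after stage 1. Stage 2 then loops over the returned agents only,
// instead of re-checking exhausted agents for every framework.
std::vector<Agent> stillAllocatable(const std::vector<Agent>& agents)
{
  std::vector<Agent> result;

  for (const Agent& agent : agents) {
    if (allocatable(agent)) {
      result.push_back(agent);
    }
  }

  return result;
}
```

With F frameworks and A agents, the check runs A times up front rather than up to F x A times inside the stage 2 loop.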





[jira] [Commented] (MESOS-4983) Segfault in ProcessTest.Spawn with GCC 6

2016-03-19 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15202378#comment-15202378
 ] 

Neil Conway commented on MESOS-4983:


Well yeah, I'm not suggesting we spend a ton of time looking into it :) I just 
opened the issue for tracking. If someone wants to determine whether it's a GCC 
bug or a Mesos bug, that would be great. Otherwise we can wait until GCC 6 is 
released and then see if it repros at that point.

> Segfault in ProcessTest.Spawn with GCC 6
> 
>
> Key: MESOS-4983
> URL: https://issues.apache.org/jira/browse/MESOS-4983
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess, tests
>Reporter: Neil Conway
>  Labels: mesosphere
>
> {{ProcessTest.Spawn}} fails deterministically for me with GCC 6 and 
> {{--enable-optimize}}. Recent Arch Linux, GCC "6.0.0 20160227".
> {noformat}
> [ RUN  ] ProcessTest.Spawn
> *** Aborted at 145817 (unix time) try "date -d @145817" if you are 
> using GNU date ***
> PC: @   0x522926 SpawnProcess::initialize()
> *** SIGSEGV (@0x0) received by PID 11359 (TID 0x7faa6075f700) from PID 0; 
> stack trace: ***
> @ 0x7faa670dbe80 (unknown)
> @   0x522926 SpawnProcess::initialize()
> @   0x646fa6 process::ProcessManager::resume()
> @   0x6471ff 
> _ZNSt6thread11_State_implISt12_Bind_simpleIFSt5_BindIFZN7process14ProcessManager12init_threadsEvEUlRKSt6atomicIbEE_St17reference_wrapperIS7_EEEvEEE6_M_runEv
> @ 0x7faa6764a812 execute_native_thread_routine
> @ 0x7faa670d2424 start_thread
> @ 0x7faa65b04cbd __clone
> @0x0 (unknown)
> Makefile:1748: recipe for target 'check-local' failed
> make[5]: *** [check-local] Segmentation fault (core dumped)
> {noformat}
> Backtrace:
> {noformat}
> Program terminated with signal SIGSEGV, Segmentation fault.
> #0  testing::internal::ActionResultHolder::GetValueAndDelete (this=0x0) 
> at 3rdparty/gmock-1.7.0/include/gmock/gmock-spec-builders.h:1373
> 1373void GetValueAndDelete() const { delete this; }
> [Current thread is 1 (Thread 0x7faa6075f700 (LWP 11365))]
> (gdb) bt
> #0  testing::internal::ActionResultHolder::GetValueAndDelete (this=0x0) 
> at 3rdparty/gmock-1.7.0/include/gmock/gmock-spec-builders.h:1373
> #1  testing::internal::FunctionMockerBase::InvokeWith(std::tuple<> 
> const&) (args=empty std::tuple, this=0x712a7c88) at 
> 3rdparty/gmock-1.7.0/include/gmock/gmock-spec-builders.h:1530
> #2  testing::internal::FunctionMocker::Invoke() 
> (this=0x712a7c88) at 
> 3rdparty/gmock-1.7.0/include/gmock/gmock-generated-function-mockers.h:76
> #3  SpawnProcess::initialize (this=0x712a7c80) at 
> /mesos-2/3rdparty/libprocess/src/tests/process_tests.cpp:113
> #4  0x00646fa6 in process::ProcessManager::resume (this=0x25a2b60, 
> process=0x712a7d38) at /mesos-2/3rdparty/libprocess/src/process.cpp:2504
> #5  0x006471ff in process::ProcessManager:: atomic_bool&)>::operator() (__closure=, joining=...) at 
> /mesos-2/3rdparty/libprocess/src/process.cpp:2218
> #6  std::_Bind atomic_bool&)>(std::reference_wrapper 
> >)>::__call (__args=, this=) at 
> /home/vagrant/local/gcc/include/c++/6.0.0/functional:943
> #7  std::_Bind atomic_bool&)>(std::reference_wrapper 
> >)>::operator()<> (this=) at 
> /home/vagrant/local/gcc/include/c++/6.0.0/functional:1002
> #8  
> std::_Bind_simple  atomic_bool&)>(std::reference_wrapper 
> >)>()>::_M_invoke<> (this=) at 
> /home/vagrant/local/gcc/include/c++/6.0.0/functional:1400
> #9  
> std::_Bind_simple  atomic_bool&)>(std::reference_wrapper 
> >)>()>::operator() (this=) at 
> /home/vagrant/local/gcc/include/c++/6.0.0/functional:1389
> #10 
> std::thread::_State_impl  atomic_bool&)>(std::reference_wrapper >)>()> 
> >::_M_run(void) (this=) at 
> /home/vagrant/local/gcc/include/c++/6.0.0/thread:196
> #11 0x7faa6764a812 in std::(anonymous 
> namespace)::execute_native_thread_routine (__p=0x25a3bf0) at 
> ../../../../../gcc-trunk/libstdc++-v3/src/c++11/thread.cc:83
> #12 0x7faa670d2424 in start_thread () from /usr/lib/libpthread.so.0
> #13 0x7faa65b04cbd in clone () from /usr/lib/libc.so.6
> {noformat}





[jira] [Updated] (MESOS-4967) Oversubscription for reservation

2016-03-19 Thread Klaus Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-4967:

Component/s: framework
 allocation

> Oversubscription for reservation
> 
>
> Key: MESOS-4967
> URL: https://issues.apache.org/jira/browse/MESOS-4967
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation, framework, master
>Reporter: Klaus Ma
>  Labels: IBM, mesosphere
>
> Reserved resources allow frameworks and cluster operators to ensure 
> sufficient resources are available when needed.  Reservations are usually 
> made to guarantee there are enough resources under peak loads. Often times, 
> reserved resources are not actually allocated; in other words, the frameworks 
> do not use those resources and they sit reserved, but idle.
> This underutilization is either an opportunity cost or a direct cost, 
> particularly to the cluster operator.  Reserved but unallocated resources 
> held by a Lender Framework could be optimistically offered to other 
> frameworks, which we refer to as Tenant Frameworks.  When the resources are 
> requested back by the Lender Framework, some of the Tenant Framework’s tasks 
> are evicted and the original resource offer guarantee is preserved.
> The first step is to identify when resources are reserved, but not allocated. 
>  We then offer these reserved resources to other frameworks, but mark these 
> offered resources as revocable resources.  This allows Tenant Frameworks to 
> use these resources temporarily in a 'best-effort' fashion, knowing that they 
> could be revoked or reclaimed at any time.





[jira] [Commented] (MESOS-4033) Add a commit hook for non-ascii charachters

2016-03-19 Thread Yong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201573#comment-15201573
 ] 

Yong Tang commented on MESOS-4033:
--

Hi [~alexr] [~bernd-mesos], I added a review request to address this issue:
https://reviews.apache.org/r/45033/

Would appreciate for any feedbacks or comments.

> Add a commit hook for non-ascii charachters
> ---
>
> Key: MESOS-4033
> URL: https://issues.apache.org/jira/browse/MESOS-4033
> Project: Mesos
>  Issue Type: Task
>Reporter: Alexander Rukletsov
>Assignee: Yong Tang
>Priority: Minor
>  Labels: mesosphere
>
> Non-ascii characters invisible in some editors may sneak into the codebase 
> (see e.g. https://reviews.apache.org/r/40799/). To avoid this, a pre-commit 
> hook can be added.
> Quick searching suggested a simple perl script: 
> https://superuser.com/questions/417305/how-can-i-identify-non-ascii-characters-from-the-shell





[jira] [Comment Edited] (MESOS-1607) Introduce optimistic offers.

2016-03-19 Thread Klaus Ma (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1607?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199503#comment-15199503
 ] 

Klaus Ma edited comment on MESOS-1607 at 3/17/16 1:19 PM:
--

I updated this JIRA to align with its description. The feature we were working 
on is {{Oversubscription for Reservation}}, which has moved to MESOS-4967.

Thanks
Klaus


was (Author: klaus1982):
I update this JIRA to align with its description. The feature we're working on 
is Oversubscription for Reservation which is moved to MESOS-4967.

Thanks
Klaus

> Introduce optimistic offers.
> 
>
> Key: MESOS-1607
> URL: https://issues.apache.org/jira/browse/MESOS-1607
> Project: Mesos
>  Issue Type: Epic
>  Components: allocation, framework, master
>Reporter: Benjamin Hindman
>Assignee: Artem Harutyunyan
>  Labels: mesosphere
> Attachments: optimisitic-offers.pdf
>
>
> *Background*
> The current implementation of resource offers only enables a single framework 
> scheduler to make scheduling decisions for some available resources at a 
> time. In some circumstances, this is good, i.e., when we don't want other 
> framework schedulers to have access to some resources. However, in other 
> circumstances, there are advantages to letting multiple framework schedulers 
> attempt to make scheduling decisions for the _same_ allocation of resources 
> in parallel.
> If you think about this from a "concurrency control" perspective, the current 
> implementation of resource offers is _pessimistic_: the resources contained 
> within an offer are _locked_ until the framework scheduler that they were 
> offered to launches tasks with them or declines them. In addition to making 
> pessimistic offers we'd like to give out _optimistic_ offers, where the same 
> resources are offered to multiple framework schedulers at the same time, and 
> framework schedulers "compete" for those resources on a 
> first-come-first-serve basis (i.e., the first to launch a task "wins"). We've 
> always reserved the right to rescind resource offers using the 'rescind' 
> primitive in the API, and a framework scheduler should be prepared to launch 
> tasks and have those tasks go lost because another framework already started 
> to use those resources.
> *Feature*
> We plan to take a step towards optimistic offers, by introducing primitives 
> that allow resources to be offered to multiple frameworks at once.  At first, 
> we will use these primitives to optimistically allocate resources that are 
> reserved for a particular framework/role but have not been allocated by that 
> framework/role.  
> The work with optimistic offers will closely resemble the existing 
> oversubscription feature.  Optimistically offered resources are likely to be 
> considered "revocable resources" (the concept that using resources not 
> reserved for you means you might get those resources revoked).  In effect, we 
> may create something like a "spot" market for unused resources, driving 
> up utilization by letting frameworks that are willing to use revocable 
> resources run tasks.
> *Future Work*
> This ticket tracks the introduction of some aspects of optimistic offers.  
> Taken to the limit, one could imagine always making optimistic resource 
> offers. This bears a striking resemblance to the Google Omega model (an 
> isomorphism even). However, being able to configure what resources should be 
> allocated optimistically and what resources should be allocated 
> pessimistically gives even more control to a datacenter/cluster operator that 
> might want to, for example, never let multiple frameworks (roles) compete for 
> some set of resources.
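
The "first to launch wins" rule from the Background section is essentially optimistic concurrency control, and can be sketched with a compare-and-swap. This is an illustration of the concept only. The `OptimisticOffer` class and the integer framework ids are hypothetical and say nothing about how the master would implement it:

```cpp
#include <atomic>

// Marker for an offer that no framework has used yet.
constexpr int UNCLAIMED = -1;

// Hypothetical bundle of resources offered to several frameworks at once.
class OptimisticOffer
{
public:
  // Returns true only for the first framework that launches with this offer.
  // Every later attempt fails, which corresponds to that framework's task
  // being lost because the resources are already in use.
  bool tryLaunch(int frameworkId)
  {
    int expected = UNCLAIMED;
    return owner.compare_exchange_strong(expected, frameworkId);
  }

  // Id of the framework that claimed the offer, or UNCLAIMED.
  int winner() const { return owner.load(); }

private:
  std::atomic<int> owner{UNCLAIMED};
};
```

A pessimistic offer, by contrast, would hand the resources to exactly one framework and keep them locked until that framework launches or declines.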





[jira] [Commented] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Dan Osborne (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198135#comment-15198135
 ] 

Dan Osborne commented on MESOS-4823:


Thank you for providing the example use case. Can you explain, on a technical 
level, what condition you are planning to use to trigger creation of these 
iptables rules?

I'm concerned that the capability you're trying to provide makes a lot of 
assumptions about both the mesos cluster and the CNI network's configurations, 
and to what degree both are accessible by the public network.

I believe that if this behavior goes in, to some degree it should be opt-in or 
opt-out, as not all clusters or CNI networks would want such behavior. 

Some counter use cases - 
1. if the CNI network _is_ assigning publicly accessible addresses, the port 
mapping becomes redundant.

2. if they are using a load balancer, they would not need port forwarding as 
the load balancer will forward public requests onto the private CNI network.

> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are 
> listening on to the outside world. When containers are running on bridged (or ptp) 
> networking this can be achieved by installing port forwarding rules on the 
> agent (using iptables). This can be done in the `network/cni` isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.





[jira] [Commented] (MESOS-4977) Sometime Cmd":["-c","echo 'No such file or directory'] in task.

2016-03-19 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201802#comment-15201802
 ] 

haosdent commented on MESOS-4977:
-

Your error log is related to this part of the code.
{code:title=slave.cpp|borderStyle=solid}
  if (task.has_command()) {
  ...
  executor.mutable_command()->set_value(
  "echo '" +
  (path.isError() ? path.error() : "No such file or directory") +
  "'; exit 1");
  ...
  }
{code}

The weird thing is that it goes into this part even though you don't have a 
command in the task. If you don't use Marathon and instead use, for example, 
{{mesos-execute}} to simulate this case, does it still happen?

> Sometime Cmd":["-c","echo 'No such file or directory'] in task.
> ---
>
> Key: MESOS-4977
> URL: https://issues.apache.org/jira/browse/MESOS-4977
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.2
> Environment: 189 mesos slaves on Ubuntu 14.04.3 LTS
>Reporter: Sergey Galkin
>
> mesos - 0.27.0
> marathon - 0.15.2
> I am trying to launch 1 simple docker application with nginx with 500 
> instances on a cluster with 189 HW nodes through Marathon
> {code}
> ID /1f532267a08494e3081c1acb42d273b7
> Command Unspecified
> Constraints Unspecified
> Dependencies Unspecified
> Labels Unspecified
> Resource Roles Unspecified
> Container
> {
>   "type": "DOCKER",
>   "volumes": [],
>   "docker": {
>     "image": "nginx",
>     "network": "BRIDGE",
>     "portMappings": [
>       {
>         "containerPort": 80,
>         "hostPort": 0,
>         "servicePort": 1,
>         "protocol": "tcp"
>       }
>     ],
>     "privileged": false,
>     "parameters": [],
>     "forcePullImage": false
>   }
> }
> CPUs 1
> Environment Unspecified
> Executor Unspecified
> Health Checks 
> [
>   {
>     "path": "/",
>     "protocol": "HTTP",
>     "portIndex": 0,
>     "gracePeriodSeconds": 300,
>     "intervalSeconds": 60,
>     "timeoutSeconds": 20,
>     "maxConsecutiveFailures": 3,
>     "ignoreHttp1xx": false
>   }
> ]
> Instances 500
> IP Address Unspecified
> Memory 256 MiB
> Disk Space 50 MiB
> Ports 1
> Backoff Factor 1.15
> Backoff 1 seconds
> Max Launch Delay 3600 seconds
> URIs Unspecified
> User Unspecified
> {code}
> The deployment stopped at Delayed; only about 360-370 of the 500 instances are 
> successful. In the stdout of the failed Mesos tasks I see "No such file or 
> directory".
> As I see in /var/log/upstart/docker.log with debug enabled, Mesos sometimes 
> tries to start containers with a strange Cmd ("Cmd":["-c","echo 'No such file or 
> directory'; exit 1"]) and the task fails. Sometimes everything is OK: 
> "Cmd":null and the task is in the RUNNING state.
> Part of the log is available at http://paste.openstack.org/show/491122/
> I successfully started 700 nginx docker applications with 10 instances 
> simultaneously in this cluster.





[jira] [Commented] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Dan Osborne (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15197757#comment-15197757
 ] 

Dan Osborne commented on MESOS-4823:


What is the use case for requiring port forwarding? 

I don't believe this feature request should be implemented, as I don't believe 
that port forwarding fits into the larger CNI story.

CNI defines a container's network as "a group of entities that are uniquely 
addressable". In general, CNI plugins do not make use of port forwarding 
because addresses in their network are *uniquely* addressable. 

The port which a container is running services on should be accessible on the 
IP address the CNI network assigned to it. I believe that forwarding a port on 
the agent's IP to a port on the CNI network's IP is fundamentally wrong, as it 
suggests that the container's CNI IP is not uniquely addressable.

> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are 
> listening on to the outside world. When containers are running on bridged (or ptp) 
> networking this can be achieved by installing port forwarding rules on the 
> agent (using iptables). This can be done in the `network/cni` isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.





[jira] [Updated] (MESOS-4735) CommandInfo.URI should allow specifying target filename

2016-03-19 Thread Michael Browning (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4735?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Browning updated MESOS-4735:

Shepherd: Vinod Kone

> CommandInfo.URI should allow specifying target filename
> ---
>
> Key: MESOS-4735
> URL: https://issues.apache.org/jira/browse/MESOS-4735
> Project: Mesos
>  Issue Type: Improvement
>  Components: fetcher
>Reporter: Erik Weathers
>Assignee: Michael Browning
>Priority: Minor
>
> The {{CommandInfo.URI}} message should allow explicitly choosing the 
> downloaded file's name, to better mimic functionality present in tools like 
> {{wget}} and {{curl}}.
> This relates to issues when the {{CommandInfo.URI}} is pointing to a URL that 
> has query parameters at the end of the path, resulting in the downloaded 
> filename having those elements.  This also prevents extracting of such files, 
> since the extraction logic is simply looking at the file's suffix. See 
> MESOS-3367, MESOS-1686, and MESOS-1509 for more info.  If this issue were 
> fixed, then I could work around the other issues not being fixed by modifying 
> my framework's scheduler to set the target filename.
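
To make the failure mode concrete, here is a small sketch under stated assumptions. The helpers `naiveBasename`, `outputName` and `looksLikeTarball` are hypothetical and only mimic the behavior described in the ticket (filename taken from the end of the URI, extraction decided by suffix). They are not the fetcher's code:

```cpp
#include <string>

// Everything after the last '/'. Query parameters that follow the path end up
// in the resulting filename.
std::string naiveBasename(const std::string& uri)
{
  const std::string::size_type slash = uri.find_last_of('/');
  return slash == std::string::npos ? uri : uri.substr(slash + 1);
}

// Proposed behavior: an explicitly chosen target filename, when given, takes
// precedence over the name derived from the URI.
std::string outputName(const std::string& uri, const std::string& target)
{
  return target.empty() ? naiveBasename(uri) : target;
}

// Suffix-only check, standing in for extraction logic that looks only at the
// end of the filename.
bool looksLikeTarball(const std::string& filename)
{
  const std::string suffix = ".tar.gz";
  return filename.size() >= suffix.size() &&
         filename.compare(
             filename.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

For a URI such as `http://example.com/pkg/app.tar.gz?token=abc` (a made-up address), the derived name is `app.tar.gz?token=abc`, which fails the suffix check. Passing `app.tar.gz` as the target restores it.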





[jira] [Commented] (MESOS-4963) Incorrect CXXFLAGS with GCC 6

2016-03-19 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201653#comment-15201653
 ] 

haosdent commented on MESOS-4963:
-

I think you forgot to update the Python build flags. GCC 4.8 doesn't enable 
{{c++11}} by default, so building Mesos with GCC 4.8 would fail.

> Incorrect CXXFLAGS with GCC 6
> -
>
> Key: MESOS-4963
> URL: https://issues.apache.org/jira/browse/MESOS-4963
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
> Fix For: 0.29.0
>
>
> {noformat}
> $ head config.log
> [...]
> /mesos-2/configure --enable-optimize --disable-python CC=ccache 
> /home/vagrant/local/gcc/bin/gcc CXX=ccache /home/vagrant/local/gcc/bin/g++
> $ ~/local/gcc/bin/g++ --version
> g++ (GCC) 6.0.0 20160227 (experimental)
> Copyright (C) 2016 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> $ make V=0
> make[2]: Entering directory '/home/vagrant/build-mesos-2-gcc6/src'
>   CXX  appc/libmesos_no_3rdparty_la-spec.lo
> In file included from 
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/shell.hpp:22:0,
>  from 
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp:56,
>  from /mesos-2/src/appc/spec.cpp:17:
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp: 
> In instantiation of ‘int os::execlp(const char*, T ...) [with T = {const 
> char*, const char*, const char*, char*}]’:
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/fork.hpp:371:52:
>required from here
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp:151:18:
>  error: missing sentinel in function call [-Werror=format=]
>return ::execlp(file, t...);
>   ^~~~
> cc1plus: all warnings being treated as errors
> Makefile:5584: recipe for target 'appc/libmesos_no_3rdparty_la-spec.lo' failed
> {noformat}





[jira] [Created] (MESOS-4971) Add unit tests for MOUNT persistent volumes

2016-03-19 Thread Neil Conway (JIRA)
Neil Conway created MESOS-4971:
--

 Summary: Add unit tests for MOUNT persistent volumes
 Key: MESOS-4971
 URL: https://issues.apache.org/jira/browse/MESOS-4971
 Project: Mesos
  Issue Type: Task
  Components: tests
Reporter: Neil Conway


We currently have unit tests for root and {{PATH}} disk types, but not 
{{MOUNT}} disks.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-4986) Remove full used agent in stage 2

2016-03-19 Thread Klaus Ma (JIRA)
Klaus Ma created MESOS-4986:
---

 Summary: Remove full used agent in stage 2
 Key: MESOS-4986
 URL: https://issues.apache.org/jira/browse/MESOS-4986
 Project: Mesos
  Issue Type: Bug
  Components: allocation
Reporter: Klaus Ma


If the resources in an Agent are not {{allocatable}}, it's not necessary to 
handle it in stage 2 for each framework. It will improve the performance by 
avoiding an unnecessary loop.





[jira] [Commented] (MESOS-4932) Propose Design for Authorization based filtering for endpoints.

2016-03-19 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201026#comment-15201026
 ] 

Adam B commented on MESOS-4932:
---

I'm no longer interested in `_WITH_OWNER`, because of the aforementioned 
difficulty of sharing only some of a user's tasks. Instead, we can attach a 
task-group to each task, which becomes tied to the hierarchical role once we 
have those. See my comments in MESOS-4772.
I don't know if that means we have a _WITH_ROLE and a _WITH_GROUP (or 
_WITH_SPACE) for now, and then merge them into one in the future; or if we can 
overload _WITH_ROLE to incorporate task groups within a framework.
[~klueska] suggested we could just have a frameworkInfo.role for the framework 
group hierarchy and taskInfo.role for the task group hierarchy, but then I 
don't know what you'd call the combined (framework.role:task.role) that you'd 
do authorization on.

> Propose Design for Authorization based filtering for endpoints.
> ---
>
> Key: MESOS-4932
> URL: https://issues.apache.org/jira/browse/MESOS-4932
> Project: Mesos
>  Issue Type: Task
>  Components: security
>Reporter: Joerg Schad
>Assignee: Joerg Schad
>  Labels: authorization, mesosphere, security
>






[jira] [Updated] (MESOS-4964) curl based docker fetcher fails to decode chunked encoding

2016-03-19 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4964:
--
Affects Version/s: 0.28.0

> curl based docker fetcher fails to decode chunked encoding
> --
>
> Key: MESOS-4964
> URL: https://issues.apache.org/jira/browse/MESOS-4964
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Affects Versions: 0.28.0
>Reporter: James Peach
>Assignee: Jie Yu
>  Labels: mesosphere
> Fix For: 0.29.0
>
>
> If the curl-based fetcher gets an HTTP response that is chunked, the HTTP 
> decode fails because the response says it is chunked, but curl is dechunking 
> the body to stdout.
> {code}
> E0316 15:23:31.124482 13299 slave.cpp:3773] Container 
> 'fa06a5ee-637e-480c-b602-59705b707d85' for executor 'jpeach.10489' of 
> framework 96d1191b-cdf0-40f6-8840-e4d4d92a9345-0010 failed to start: Collect 
> failed: Failed to decode HTTP responses: Decoding failed
> HTTP/1.1 400 Bad Request
> Server: nginx/1.9.4
> Date: Wed, 16 Mar 2016 22:23:30 GMT
> Content-Type: application/json
> Transfer-Encoding: chunked
> Connection: keep-alive
> X-Artifactory-Id: ae6c9bffd47ec19a:-61ef0a68:1537a605a05:-8000
> {
>   "errors" : [ {
> "status" : 400,
> "message" : "Unsupported docker v2 repository request for 
> 'docker-registry'"
>   } ]
> }
> {code}
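
To make the failure mode concrete, here is a minimal sketch of chunked-body decoding as described in RFC 7230, section 4.1. `decodeChunked` is a hypothetical helper, not the decoder Mesos uses, and it ignores chunk extensions and trailers. Given a body that curl has already dechunked (such as the JSON above), it fails at the first step because no hex chunk-size line is present.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: decodes an HTTP/1.1 chunked body into `decoded`.
// Returns false on malformed input (in which case `decoded` may hold a
// partial result). Chunk extensions and trailers are not handled.
bool decodeChunked(const std::string& body, std::string* decoded)
{
  decoded->clear();
  std::string::size_type pos = 0;

  while (true) {
    // Each chunk starts with its size in hex, terminated by CRLF.
    const std::string::size_type eol = body.find("\r\n", pos);
    if (eol == std::string::npos || eol == pos) {
      return false;
    }

    const std::string sizeLine = body.substr(pos, eol - pos);
    if (sizeLine.find_first_not_of("0123456789abcdefABCDEF") !=
        std::string::npos) {
      return false; // Not a hex chunk size.
    }

    const unsigned long size = std::strtoul(sizeLine.c_str(), nullptr, 16);
    pos = eol + 2;

    if (size == 0) {
      return true; // Last chunk reached.
    }

    // The chunk data must be followed by CRLF.
    const std::string::size_type remaining = body.size() - pos;
    if (size > remaining || remaining - size < 2 ||
        body.compare(pos + size, 2, "\r\n") != 0) {
      return false;
    }

    decoded->append(body, pos, size);
    pos += size + 2;
  }
}
```

A decoder like this only works on the raw wire format, so a response whose headers still say `Transfer-Encoding: chunked` but whose body was already dechunked cannot be parsed by it.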





[jira] [Commented] (MESOS-4977) Sometime Cmd":["-c","echo 'No such file or directory'] in task.

2016-03-19 Thread SERGEY GALKIN (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15201767#comment-15201767
 ] 

SERGEY GALKIN commented on MESOS-4977:
--

Mesos Slaves HW

HP ProLiant DL380 Gen9,
CPU - 2 x Intel(R) Xeon(R) CPU E5-2680 v3 @2.50GHz (48 cores (with 
hyperthreading))
RAM - 264G,
Storage - 3.0T on RAID on HP Smart Array P840 Controller,
HDD - 12 x HP EH0600JDYTL
Network - 2 x Intel Corporation Ethernet 10G2P X710,


> Sometime Cmd":["-c","echo 'No such file or directory'] in task.
> ---
>
> Key: MESOS-4977
> URL: https://issues.apache.org/jira/browse/MESOS-4977
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.27.2
> Environment: 189 mesos slaves on Ubuntu 14.04.3 LTS
>Reporter: SERGEY GALKIN
>
> mesos - 0.27.0
> marathon - 0.15.2
> I am trying to launch 1 simple docker application with nginx with 500 
> instances on a cluster with 189 HW nodes through Marathon.
> {code}
> ID /1f532267a08494e3081c1acb42d273b7
> Command Unspecified
> Constraints Unspecified
> Dependencies Unspecified
> Labels Unspecified
> Resource Roles Unspecified
> Container
> {
>   "type": "DOCKER",
>   "volumes": [],
>   "docker": {
> "image": "nginx",
> "network": "BRIDGE",
> "portMappings": [
>   {
> "containerPort": 80,
> "hostPort": 0,
> "servicePort": 1,
> "protocol": "tcp"
>   }
> ],
> "privileged": false,
> "parameters": [],
> "forcePullImage": false
>   }
> }
> CPUs 1
> Environment Unspecified
> Executor Unspecified
> Health Checks 
> [
>   {
> "path": "/",
> "protocol": "HTTP",
> "portIndex": 0,
> "gracePeriodSeconds": 300,
> "intervalSeconds": 60,
> "timeoutSeconds": 20,
> "maxConsecutiveFailures": 3,
> "ignoreHttp1xx": false
>   }
> ]
> Instances 500
> IP Address Unspecified
> Memory 256 MiB
> Disk Space 50 MiB
> Ports 1
> Backoff Factor 1.15
> Backoff 1 seconds
> Max Launch Delay 3600 seconds
> URIs Unspecified
> User Unspecified
> {code}
> Deployment stopped at Delayed; only about 360-370 of 500 instances started 
> successfully. In the stdout of the failed mesos tasks I see "No such file or 
> directory".
> As I see in /var/log/upstart/docker.log with debug enabled, mesos sometimes 
> tries to start containers with a strange Cmd ("Cmd":["-c","echo 'No such file or 
> directory'; exit 1"]) and the task fails. Sometimes everything is ok: 
> "Cmd":null and the task is in RUNNING state.
> Part of the log is available at http://paste.openstack.org/show/491122/
> I successfully started 700 nginx docker applications with 10 instances 
> simultaneously in this cluster.





[jira] [Updated] (MESOS-4958) Implement clang-tidy check for log message style

2016-03-19 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier updated MESOS-4958:

Description: 
In most cases mesos log messages should not be explicitly terminated with a 
period.

We should add a check that message strings passed to e.g., {{LOG}}, 
{{std::cout}} and {{std::cerr}}, or {{CHECK*}} do not end in periods.

  was:
In most cases mesos log messages should not be explicitly terminated with a 
period.

We should add a check that message string passed to e.g., `LOG`, `std::cout` 
and `std::cerr`, or `CHECK*` do not end in periods.


> Implement clang-tidy check for log message style
> 
>
> Key: MESOS-4958
> URL: https://issues.apache.org/jira/browse/MESOS-4958
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Benjamin Bannier
>
> In most cases mesos log messages should not be explicitly terminated with a 
> period.
> We should add a check that message strings passed to e.g., {{LOG}}, 
> {{std::cout}} and {{std::cerr}}, or {{CHECK*}} do not end in periods.
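
As a rough illustration of the rule such a check would enforce, the sketch below tests whether a message ends in a period. It is not a clang-tidy check (a real one would have to locate the string literals through the Clang AST). `endsWithPeriod` is a hypothetical helper, and treating a trailing ellipsis as acceptable is an assumption, not something the ticket states.

```cpp
#include <string>

// Hypothetical helper: returns true if `message` ends in a single period,
// ignoring trailing whitespace. A trailing "..." is not flagged (assumption).
bool endsWithPeriod(const std::string& message)
{
  const std::string::size_type last = message.find_last_not_of(" \t\n");
  if (last == std::string::npos || message[last] != '.') {
    return false;
  }

  // Treat an ellipsis ("...") as intentional punctuation.
  if (last >= 2 && message[last - 1] == '.' && message[last - 2] == '.') {
    return false;
  }

  return true;
}
```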





[jira] [Created] (MESOS-4979) os::rmdir does not handle special files (e.g., device, socket).

2016-03-19 Thread Jie Yu (JIRA)
Jie Yu created MESOS-4979:
-

 Summary: os::rmdir does not handle special files (e.g., device, 
socket).
 Key: MESOS-4979
 URL: https://issues.apache.org/jira/browse/MESOS-4979
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.27.2, 0.27.1, 0.27.0, 0.26.0, 0.25.0, 0.24.0, 0.23.0, 
0.22.0, 0.21.0, 0.20.0, 0.19.0
Reporter: Jie Yu
Assignee: Jojy Varghese
Priority: Blocker
 Fix For: 0.28.0


Stout os::rmdir does not handle special files like device files or socket 
files. This could cause failures when GC sandboxes.






[jira] [Updated] (MESOS-4984) MasterTest.SlavesEndpointTwoSlaves is flaky

2016-03-19 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4984?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-4984:
---
Labels: flaky-test mesosphere tech-debt  (was: flaky-test mesosphere)

> MasterTest.SlavesEndpointTwoSlaves is flaky
> ---
>
> Key: MESOS-4984
> URL: https://issues.apache.org/jira/browse/MESOS-4984
> Project: Mesos
>  Issue Type: Bug
>  Components: tests
>Reporter: Neil Conway
>  Labels: flaky-test, mesosphere, tech-debt
> Attachments: slaves_endpoint_flaky_4984_verbose_log.txt
>
>
> Observed on Arch Linux with GCC 6, running in a virtualbox VM:
> [ RUN  ] MasterTest.SlavesEndpointTwoSlaves
> /mesos-2/src/tests/master_tests.cpp:1710: Failure
> Value of: array.get().values.size()
>   Actual: 1
> Expected: 2u
> Which is: 2
> [  FAILED  ] MasterTest.SlavesEndpointTwoSlaves (86 ms)
> Seems to fail non-deterministically, perhaps more often when there is 
> concurrent CPU load on the machine.





[jira] [Commented] (MESOS-4070) numify() handles negative numbers inconsistently.

2016-03-19 Thread Yong Tang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200776#comment-15200776
 ] 

Yong Tang commented on MESOS-4070:
--

[~jieyu] I just submitted a review request for this issue:
https://reviews.apache.org/r/45011/
Would you mind shepherding this?


> numify() handles negative numbers inconsistently.
> -
>
> Key: MESOS-4070
> URL: https://issues.apache.org/jira/browse/MESOS-4070
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Jie Yu
>Assignee: Yong Tang
>  Labels: tech-debt
>
> As pointed out by [~neilc] in this review:
> https://reviews.apache.org/r/40988
> {noformat}
> Try num2 = numify("-10");
> EXPECT_SOME_EQ(-10, num2);
> // TODO(neilc): This is inconsistent with the handling of non-hex numbers.
> EXPECT_ERROR(numify("-0x10"));
> {noformat}
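
One consistent behaviour would be to strip the sign first and then parse the remainder the same way for decimal and hexadecimal input. The sketch below illustrates that idea only. `parseSigned` is a hypothetical helper and not stout's `numify()`, and choosing to accept `-0x10` (rather than rejecting all negative hex) is an assumption.

```cpp
#include <cctype>
#include <cerrno>
#include <cstdlib>
#include <string>

// Hypothetical helper: parses decimal or "0x"-prefixed hexadecimal integers
// and applies a leading '-' uniformly to both forms. Returns false on
// malformed or out-of-range input; on success stores the value in `out`.
bool parseSigned(const std::string& text, long long* out)
{
  std::string digits = text;
  bool negative = false;
  if (!digits.empty() && digits[0] == '-') {
    negative = true;
    digits.erase(0, 1);
  }

  int base = 10;
  if (digits.size() > 2 && digits[0] == '0' &&
      (digits[1] == 'x' || digits[1] == 'X')) {
    base = 16;
    digits.erase(0, 2);
  }

  // Only plain digits may remain (no second sign, prefix, or whitespace).
  if (digits.empty()) {
    return false;
  }
  for (char c : digits) {
    const unsigned char u = static_cast<unsigned char>(c);
    if (base == 16 ? !std::isxdigit(u) : !std::isdigit(u)) {
      return false;
    }
  }

  errno = 0;
  const long long magnitude = std::strtoll(digits.c_str(), nullptr, base);
  if (errno == ERANGE) {
    return false;
  }

  *out = negative ? -magnitude : magnitude;
  return true;
}
```

With this approach `"-10"` and `"-0x10"` are both accepted, yielding -10 and -16 respectively.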





[jira] [Created] (MESOS-4972) Implement `os::rename`

2016-03-19 Thread Alex Clemmer (JIRA)
Alex Clemmer created MESOS-4972:
---

 Summary: Implement `os::rename`
 Key: MESOS-4972
 URL: https://issues.apache.org/jira/browse/MESOS-4972
 Project: Mesos
  Issue Type: Bug
  Components: stout
Reporter: Alex Clemmer
Assignee: Alex Clemmer








[jira] [Updated] (MESOS-4878) Task stuck in TASK_STAGING when docker fetcher failed to fetch the image

2016-03-19 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-4878:
--
Shepherd: Jie Yu

> Task stuck in TASK_STAGING when docker fetcher failed to fetch the image
> 
>
> Key: MESOS-4878
> URL: https://issues.apache.org/jira/browse/MESOS-4878
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization, docker
>Affects Versions: 0.27.0, 0.27.1
>Reporter: Shuai Lin
>Assignee: Shuai Lin
>
> When a task is launched with the mesos containerizer and a docker image, if 
> the docker fetcher fails to pull the image, no more task updates are sent to 
> the scheduler.
> {code}
> I0306 17:28:57.627169 17647 registry_puller.cpp:194] Pulling image 
> 'alpine:latest' from 
> 'docker-manifest://registry-1.docker.io:443alpine?latest#https' to 
> '/tmp/mesos-test/store/docker/staging/V2dqJv'
> E0306 17:29:00.749889 17651 slave.cpp:3773] Container 
> '6b98026b-a58d-434c-9432-b517012edc35' for executor 'just-a-test' of 
> framework a4ff93ba-2141-48e2-92a9-7354e4028282- failed to start: Collect 
> failed: Unexpected HTTP response '401 Unauthorized' when trying to get the 
> manifest
> I0306 17:29:00.751579 17646 containerizer.cpp:1392] Destroying container 
> '6b98026b-a58d-434c-9432-b517012edc35'
> I0306 17:29:00.752188 17646 containerizer.cpp:1395] Waiting for the isolators 
> to complete preparing before destroying the container
> I0306 17:29:57.618649 17649 slave.cpp:4322] Terminating executor 
> ''just-a-test' of framework a4ff93ba-2141-48e2-92a9-73
> {code}
> Scheduler logs:
> {code}
> sudo ./build/src/mesos-execute --docker_image=alpine:latest 
> --containerizer=mesos --name=just-a-test --command="sleep 1000" 
> --master=33.33.33.33:5050
> WARNING: Logging before InitGoogleLogging() is written to STDERR
> W0306 17:28:57.491081 17740 sched.cpp:1642] 
> **
> Scheduler driver bound to loopback interface! Cannot communicate with remote 
> master(s). You might want to set 'LIBPROCESS_IP' environment variable to use 
> a routable IP address.
> **
> I0306 17:28:57.498028 17740 sched.cpp:222] Version: 0.29.0
> I0306 17:28:57.533071 17761 sched.cpp:326] New master detected at 
> master@33.33.33.33:5050
> I0306 17:28:57.536761 17761 sched.cpp:336] No credentials provided. 
> Attempting to register without authentication
> I0306 17:28:57.557729 17759 sched.cpp:703] Framework registered with 
> a4ff93ba-2141-48e2-92a9-7354e4028282-
> Framework registered with a4ff93ba-2141-48e2-92a9-7354e4028282-
> task just-a-test submitted to slave a4ff93ba-2141-48e2-92a9-7354e4028282-S0
> {code}





[jira] [Commented] (MESOS-4823) Implement port forwarding in `network/cni` isolator

2016-03-19 Thread Avinash Sridharan (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4823?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198340#comment-15198340
 ] 

Avinash Sridharan commented on MESOS-4823:
--

You are right, we don't want to do this for every container that has `EXPOSED` 
ports (I am taking docker images as an example here). This should be an opt-in 
from frameworks launching the container. The idea was to introduce fields in 
the `NetworkInfo` protobuf that will allow frameworks to set two pieces of 
information:
a) A boolean specifying if the framework wants the container's ports to be 
exposed.
b) If (a) is true, a range of ports to select the port mapping from, or a 
container-port:host-port mapping. For the former case we would need the set of 
ports being exposed to be specified in the `ImageManifest`. 

For starters we are thinking about taking docker images as an example, since 
docker images have the `EXPOSE` directive. 


Comments are welcome.

> Implement port forwarding in `network/cni` isolator
> ---
>
> Key: MESOS-4823
> URL: https://issues.apache.org/jira/browse/MESOS-4823
> Project: Mesos
>  Issue Type: Task
>  Components: containerization
> Environment: linux
>Reporter: Avinash Sridharan
>Assignee: Avinash Sridharan
>Priority: Critical
>  Labels: mesosphere
>
> Most docker and appc images wish to expose ports that micro-services are 
> listening on, to the outside world. When containers are running on bridged 
> (or ptp) networking this can be achieved by installing port forwarding rules 
> on the agent (using iptables). This can be done in the `network/cni` 
> isolator. 
> The reason we would like this functionality to be implemented in the 
> `network/cni` isolator, and not a CNI plugin, is that the specifications 
> currently do not support specifying port forwarding rules. Further, to 
> install these rules the isolator needs two pieces of information, the exposed 
> ports and the IP address associated with the container. Both are available 
> to the isolator.





[jira] [Commented] (MESOS-3902) The Location header when non-leading master redirects to leading master is incomplete.

2016-03-19 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199883#comment-15199883
 ] 

Vinod Kone commented on MESOS-3902:
---

Transition the ticket to "In Progress" since you started actively 
thinking/working on this. Once the review is out, transition the ticket to 
"Reviewable".

> The Location header when non-leading master redirects to leading master is 
> incomplete.
> --
>
> Key: MESOS-3902
> URL: https://issues.apache.org/jira/browse/MESOS-3902
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API, master
>Affects Versions: 0.25.0
> Environment: 3 masters, 10 slaves
>Reporter: Ben Whitehead
>Assignee: Ashwin Murthy
>  Labels: mesosphere
>
> The master now sets a location header, but it's incomplete. The path of the 
> URL isn't set. Consider an example:
> {code}
> > cat /tmp/subscribe-1072944352375841456 | httpp POST 
> > 127.1.0.3:5050/api/v1/scheduler Content-Type:application/x-protobuf
> POST /api/v1/scheduler HTTP/1.1
> Accept: application/json
> Accept-Encoding: gzip, deflate
> Connection: keep-alive
> Content-Length: 123
> Content-Type: application/x-protobuf
> Host: 127.1.0.3:5050
> User-Agent: HTTPie/0.9.0
> +-+
> | NOTE: binary data not shown in terminal |
> +-+
> HTTP/1.1 307 Temporary Redirect
> Content-Length: 0
> Date: Fri, 26 Feb 2016 00:54:41 GMT
> Location: //127.1.0.1:5050
> {code}
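
One reading of the expected behaviour is that the redirect should carry the path of the original request. The following is a minimal sketch under that assumption. `redirectLocation` and its parameters are hypothetical and do not correspond to the master's actual code.

```cpp
#include <string>

// Hypothetical helper: builds a scheme-relative Location value that keeps
// the path of the original request, e.g. "/api/v1/scheduler".
std::string redirectLocation(
    const std::string& host,
    unsigned short port,
    const std::string& path)
{
  return "//" + host + ":" + std::to_string(port) + path;
}
```

For the request above, this would produce `//127.1.0.1:5050/api/v1/scheduler` instead of `//127.1.0.1:5050`.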





[jira] [Comment Edited] (MESOS-4963) Incorrect CXXFLAGS with GCC 6

2016-03-19 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15199892#comment-15199892
 ] 

Neil Conway edited comment on MESOS-4963 at 3/17/16 10:32 PM:
--

Ah, good catch. What is going on here is that GCC6 defaults to 
{{-std=gnu\+\+14}} (previous versions defaulted to {{-std=gnu\+\+98}}). For 
some reason I haven't investigated, the code in question compiles with {{-std}} 
set to {{c\+\+11}} and {{c\+\+14}}, but not {{gnu\+\+11}} or {{gnu\+\+14}}. We 
likely don't want to use {{gnu\+\+14}} anyway.


was (Author: neilc):
Ah, good catch. What is going on here is that GCC6 defaults to {{-std=gnu++14}} 
(previously versions defaulted to {{-std=gnu++98}}). For some reason I haven't 
investigated, the code in question compiles with {{-std}} set to {{c++11}} and 
{{c++14}}, but not {{gnu++11}} or {{gnu++14}}. We likely don't want to use 
{{gnu++14}} anyway.

> Incorrect CXXFLAGS with GCC 6
> -
>
> Key: MESOS-4963
> URL: https://issues.apache.org/jira/browse/MESOS-4963
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> {noformat}
> $ head config.log
> [...]
> /mesos-2/configure --enable-optimize --disable-python CC=ccache 
> /home/vagrant/local/gcc/bin/gcc CXX=ccache /home/vagrant/local/gcc/bin/g++
> $ ~/local/gcc/bin/g++ --version
> g++ (GCC) 6.0.0 20160227 (experimental)
> Copyright (C) 2016 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> $ make V=0
> make[2]: Entering directory '/home/vagrant/build-mesos-2-gcc6/src'
>   CXX  appc/libmesos_no_3rdparty_la-spec.lo
> In file included from 
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/shell.hpp:22:0,
>  from 
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp:56,
>  from /mesos-2/src/appc/spec.cpp:17:
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp: 
> In instantiation of ‘int os::execlp(const char*, T ...) [with T = {const 
> char*, const char*, const char*, char*}]’:
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/fork.hpp:371:52:
>required from here
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp:151:18:
>  error: missing sentinel in function call [-Werror=format=]
>return ::execlp(file, t...);
>   ^~~~
> cc1plus: all warnings being treated as errors
> Makefile:5584: recipe for target 'appc/libmesos_no_3rdparty_la-spec.lo' failed
> {noformat}





[jira] [Comment Edited] (MESOS-4849) Add agent flags for HTTP authentication

2016-03-19 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-4849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15185518#comment-15185518
 ] 

Greg Mann edited comment on MESOS-4849 at 3/17/16 10:57 PM:


Reviews here:

https://reviews.apache.org/r/44678/
https://reviews.apache.org/r/44703/
https://reviews.apache.org/r/44515/
https://reviews.apache.org/r/44523/
https://reviews.apache.org/r/44989/


was (Author: greggomann):
Reviews here:

https://reviews.apache.org/r/44678/
https://reviews.apache.org/r/44703/
https://reviews.apache.org/r/44515/
https://reviews.apache.org/r/44523/

> Add agent flags for HTTP authentication
> ---
>
> Key: MESOS-4849
> URL: https://issues.apache.org/jira/browse/MESOS-4849
> Project: Mesos
>  Issue Type: Task
>  Components: security, slave
>Reporter: Adam B
>Assignee: Greg Mann
>  Labels: mesosphere, security
>
> Flags should be added to the agent to:
> 1. Enable HTTP authentication ({{--authenticate_http}})
> 2. Specify credentials ({{--http_credentials}})
> 3. Specify HTTP authenticators ({{--authenticators}})





[jira] [Created] (MESOS-4967) Oversubscription for reservation

2016-03-19 Thread Klaus Ma (JIRA)
Klaus Ma created MESOS-4967:
---

 Summary: Oversubscription for reservation
 Key: MESOS-4967
 URL: https://issues.apache.org/jira/browse/MESOS-4967
 Project: Mesos
  Issue Type: Epic
  Components: master
Reporter: Klaus Ma


Reserved resources allow frameworks and cluster operators to ensure sufficient 
resources are available when needed.  Reservations are usually made to 
guarantee there are enough resources under peak loads. Often times, reserved 
resources are not actually allocated; in other words, the frameworks do not use 
those resources and they sit reserved, but idle.

This underutilization is either an opportunity cost or a direct cost, 
particularly to the cluster operator.  Reserved but unallocated resources held 
by a Lender Framework could be optimistically offered to other frameworks, 
which we refer to as Tenant Frameworks.  When the resources are requested back 
by the Lender Framework, some of the Tenant Framework’s tasks are evicted and 
the original resource offer guarantee is preserved.

The first step is to identify when resources are reserved, but not allocated.  
We then offer these reserved resources to other frameworks, but mark these 
offered resources as revocable resources.  This allows Tenant Frameworks to use 
these resources temporarily in a 'best-effort' fashion, knowing that they could 
be revoked or reclaimed at any time.
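
The first step described above amounts to computing, per reservation, the amount that is reserved but not allocated. A minimal sketch of that arithmetic follows, assuming a deliberately simplified model with a single scalar resource. `Reservation` and `revocable` are hypothetical names and not part of the Mesos allocator.

```cpp
#include <algorithm>

// Hypothetical, simplified resource model: a single scalar (e.g. CPUs).
struct Reservation
{
  double reserved;   // Amount reserved for the Lender Framework.
  double allocated;  // Amount of the reservation currently in use.
};

// Returns the reserved-but-unallocated amount that could be offered to
// Tenant Frameworks as revocable resources.
double revocable(const Reservation& r)
{
  return std::max(0.0, r.reserved - r.allocated);
}
```

For example, a reservation of 8 CPUs with 3 allocated would leave 5 CPUs that could be offered as revocable.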





[jira] [Updated] (MESOS-4963) Incorrect CXXFLAGS with GCC 6

2016-03-19 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4963?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-4963:
---
Summary: Incorrect CXXFLAGS with GCC 6  (was: Compile error with GCC 6)

> Incorrect CXXFLAGS with GCC 6
> -
>
> Key: MESOS-4963
> URL: https://issues.apache.org/jira/browse/MESOS-4963
> Project: Mesos
>  Issue Type: Bug
>  Components: stout
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>
> {noformat}
> $ head config.log
> [...]
> /mesos-2/configure --enable-optimize --disable-python CC=ccache 
> /home/vagrant/local/gcc/bin/gcc CXX=ccache /home/vagrant/local/gcc/bin/g++
> $ ~/local/gcc/bin/g++ --version
> g++ (GCC) 6.0.0 20160227 (experimental)
> Copyright (C) 2016 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> $ make V=0
> make[2]: Entering directory '/home/vagrant/build-mesos-2-gcc6/src'
>   CXX  appc/libmesos_no_3rdparty_la-spec.lo
> In file included from 
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/shell.hpp:22:0,
>  from 
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os.hpp:56,
>  from /mesos-2/src/appc/spec.cpp:17:
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp: 
> In instantiation of ‘int os::execlp(const char*, T ...) [with T = {const 
> char*, const char*, const char*, char*}]’:
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/fork.hpp:371:52:
>required from here
> /mesos-2/3rdparty/libprocess/3rdparty/stout/include/stout/os/posix/shell.hpp:151:18:
>  error: missing sentinel in function call [-Werror=format=]
>return ::execlp(file, t...);
>   ^~~~
> cc1plus: all warnings being treated as errors
> Makefile:5584: recipe for target 'appc/libmesos_no_3rdparty_la-spec.lo' failed
> {noformat}





[jira] [Updated] (MESOS-4967) Oversubscription for reservation

2016-03-19 Thread Klaus Ma (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-4967?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Klaus Ma updated MESOS-4967:

Labels: master  (was: )

> Oversubscription for reservation
> 
>
> Key: MESOS-4967
> URL: https://issues.apache.org/jira/browse/MESOS-4967
> Project: Mesos
>  Issue Type: Epic
>  Components: master
>Reporter: Klaus Ma
>  Labels: master
>
> Reserved resources allow frameworks and cluster operators to ensure 
> sufficient resources are available when needed.  Reservations are usually 
> made to guarantee there are enough resources under peak loads. Often times, 
> reserved resources are not actually allocated; in other words, the frameworks 
> do not use those resources and they sit reserved, but idle.
> This underutilization is either an opportunity cost or a direct cost, 
> particularly to the cluster operator.  Reserved but unallocated resources 
> held by a Lender Framework could be optimistically offered to other 
> frameworks, which we refer to as Tenant Frameworks.  When the resources are 
> requested back by the Lender Framework, some of the Tenant Framework’s tasks 
> are evicted and the original resource offer guarantee is preserved.
> The first step is to identify when resources are reserved, but not allocated. 
>  We then offer these reserved resources to other frameworks, but mark these 
> offered resources as revocable resources.  This allows Tenant Frameworks to 
> use these resources temporarily in a 'best-effort' fashion, knowing that they 
> could be revoked or reclaimed at any time.




