[jira] [Commented] (MESOS-2724) Support running custom commands on slaves when launching a docker container

2015-06-10 Thread chenzongzhi (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580336#comment-14580336
 ] 

chenzongzhi commented on MESOS-2724:


After looking at the Hook and Anonymous Module interfaces, I find that an 
Anonymous Module is not suitable for this ticket. Because an Anonymous Module 
is not injected into the Mesos core, it cannot get the container information.

As for the Hook Module, as [~lins05] said, we should add a hook that runs 
after a container is launched, like this:

  virtual Try<Nothing> slaveAfterLaunchTaskHook(
      const TaskInfo& taskInfo,
      const SlaveInfo& slaveInfo)

In this hook, we can get the cgroup id from taskInfo and set the limit there.

My question is whether we need a hook that runs after a task is launched. I 
think this is a generic requirement.

Do you have any suggestions? [~lins05] [~tnachen]
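
To make the proposed hook a little more concrete, here is a minimal, 
self-contained sketch. TaskInfo, SlaveInfo and Try below are simplified 
stand-ins, not the real Mesos/stout types, and CgroupLimitHook and the 
cgroup field are hypothetical names used only for illustration. It shows a 
default no-op hook plus one override, nothing more.

```cpp
#include <string>
#include <vector>

// Simplified stand-ins for the Mesos/stout types (assumptions, not the real API).
struct Nothing {};

template <typename T>
struct Try {
  bool ok;
  std::string error;

  static Try some() { return Try{true, ""}; }
  static Try fail(const std::string& message) { return Try{false, message}; }
};

struct TaskInfo {
  std::string taskId;
  std::string cgroup;  // Hypothetical: where the container's cgroup would be found.
};

struct SlaveInfo {
  std::string hostname;
};

// Hook interface with the proposed post-launch callback. The default
// implementation does nothing, so existing hooks would be unaffected.
class Hook {
public:
  virtual ~Hook() {}

  virtual Try<Nothing> slaveAfterLaunchTaskHook(
      const TaskInfo& taskInfo,
      const SlaveInfo& slaveInfo)
  {
    (void) taskInfo;
    (void) slaveInfo;
    return Try<Nothing>::some();
  }
};

// Example hook that records which cgroup it would adjust on the host.
class CgroupLimitHook : public Hook {
public:
  std::vector<std::string> adjusted;

  Try<Nothing> slaveAfterLaunchTaskHook(
      const TaskInfo& taskInfo,
      const SlaveInfo& slaveInfo) override
  {
    if (taskInfo.cgroup.empty()) {
      return Try<Nothing>::fail("no cgroup for task " + taskInfo.taskId);
    }
    // A real hook would write the limit into the cgroup hierarchy here.
    adjusted.push_back(slaveInfo.hostname + ":" + taskInfo.cgroup);
    return Try<Nothing>::some();
  }
};
```

Keeping the default implementation a no-op means only modules that care 
about the post-launch step need to override it.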

 Support running custom commands on slaves when launching a docker container
 ---

 Key: MESOS-2724
 URL: https://issues.apache.org/jira/browse/MESOS-2724
 Project: Mesos
  Issue Type: Improvement
  Components: framework
Reporter: chenzongzhi
  Labels: features

 We use Mesos + Marathon to build our PaaS platform, and we have run into a 
 problem.
 We want to execute some commands after the docker container has started; for 
 example, we want to change the cgroup settings.
 We know we can execute commands inside the docker container, but we want to 
 execute commands on the host machine.
 Does anyone know how to implement this, or have any good ideas?
 Thanks



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-2800) Rename Option<T>::get(const T& _t) to getOrElse() and refactor the original function

2015-06-10 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580195#comment-14580195
 ] 

Joris Van Remoortere commented on MESOS-2800:
-

Hey Mark. Can you add me as a reviewer on those so that they show up in my RB 
dashboard? Thanks!

 Rename Option<T>::get(const T& _t) to getOrElse() and refactor the original 
 function
 

 Key: MESOS-2800
 URL: https://issues.apache.org/jira/browse/MESOS-2800
 Project: Mesos
  Issue Type: Improvement
  Components: stout
Reporter: Mark Wang
Assignee: Mark Wang
Priority: Minor
  Labels: newbie

 As suggested, if we want to change the name then we should refactor the 
 original function as opposed to having 2 copies. 
 If we did have 2 versions of the same function, it would make more sense to 
 delegate one of them to the other.
 As of today, there is only one file that needs to be refactored: 
 3rdparty/libprocess/3rdparty/stout/include/stout/os/osx.hpp at lines 151 and 161
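
 As a rough illustration of the rename, and of having one accessor delegate to 
 the other, here is a minimal sketch. The Option class below is a 
 stripped-down stand-in for stout's Option<T>, not the real class; only the 
 relationship between get() and getOrElse() is the point.

```cpp
#include <cassert>
#include <string>

// Stripped-down stand-in for stout's Option<T>. Assumption: the real class
// has more members; this one also requires T to be default-constructible.
template <typename T>
class Option {
public:
  static Option none() { return Option(); }
  static Option some(const T& t) { return Option(t); }

  bool isSome() const { return set_; }
  bool isNone() const { return !set_; }

  // Plain accessor; only valid when a value is present.
  const T& get() const
  {
    assert(isSome());
    return value_;
  }

  // Formerly the get(const T& _t) overload: returns the stored value, or
  // the supplied fallback when the option is empty. It delegates to get()
  // so there is a single place that reads the value.
  T getOrElse(const T& fallback) const
  {
    return isNone() ? fallback : get();
  }

private:
  Option() : set_(false), value_() {}
  explicit Option(const T& t) : set_(true), value_(t) {}

  bool set_;
  T value_;
};
```

 With a distinct name, a call such as option.getOrElse(7) can no longer be 
 mistaken for the plain get() accessor.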





[jira] [Updated] (MESOS-2839) Segmentation fault in freeaddrinfo when used with illegal/misconfigured IP

2015-06-10 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov updated MESOS-2839:
---
Affects Version/s: 0.22.1

 Segmentation fault in freeaddrinfo when used with illegal/misconfigured IP
 --

 Key: MESOS-2839
 URL: https://issues.apache.org/jira/browse/MESOS-2839
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.22.1
Reporter: Niklas Quarfot Nielsen

 A JVM crash was triggered by a misconfigured IP:
 {code}
 #
 # A fatal error has been detected by the Java Runtime Environment:
 #
 #  SIGSEGV (0xb) at pc=0x7f4cdccea540, pid=15060, tid=139963129837312
 #
 # JRE version: Java(TM) SE Runtime Environment (7.0_75-b13) (build 
 1.7.0_75-b13)
 # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.75-b04 mixed mode 
 linux-amd64 compressed oops)
 # Problematic frame:
 # C  [libc.so.6+0xe9540]  freeaddrinfo+0x10
 #
 # Core dump written. Default location: 
 /home/arodriguez/dev/datavis-master/back/core or core.15060 (max size 5 
 kB). To ensure a full core dump, try ulimit -c unlimited before starting 
 Java again
 #
 # If you would like to submit a bug report, please visit:
 #   http://bugreport.sun.com/bugreport/crash.jsp
 # The crash happened outside the Java Virtual Machine in native code.
 # See problematic frame for where to report the bug.
 #
 ---  T H R E A D  ---
 Current thread (0x7f4c1409a000):  JavaThread MesosSchedulerBackend 
 driver daemon [_thread_in_native, id=17085, 
 stack(0x7f4bb492b000,0x7f4bb4a2c000)]
 siginfo:si_signo=SIGSEGV: si_errno=0, si_code=128 (), 
 si_addr=0x
 Registers:
 RAX=0xfffe, RBX=0x80bf775da490b900, RCX=0x7f4bb4a29ee0, 
 RDX=0x
 RSP=0x7f4bb4a2a220, RBP=0x7f4bb4a2a430, RSI=0x7f4bb4a2a028, 
 RDI=0x80bf775da490b900
 R8 =0x, R9 =0x, R10=0x0020, 
 R11=0x000c
 R12=0x7f4bb4a2a660, R13=0x7f4bb4a2a270, R14=0x7f4bb4a2a330, 
 R15=0x000a
 RIP=0x7f4cdccea540, EFLAGS=0x00010286, CSGSFS=0x0033, 
 ERR=0x
   TRAPNO=0x000d
 Top of Stack: (sp=0x7f4bb4a2a220)
 0x7f4bb4a2a220:   7f4bb4a2a330 7f4bc0c8e660
 0x7f4bb4a2a230:   7f4bb4a2a430 7f4bc318db8e
 0x7f4bb4a2a240:   0004 7f4bb4a2a370
 0x7f4bb4a2a250:   7f4bb4a2a280 fffe
 0x7f4bb4a2a260:   80b44b5d5d51 7f4c9420
 0x7f4bb4a2a270:   7f4c944dad08 80bf775da490b900
 0x7f4bb4a2a280:   7f4c940285c8 7f4bc0a1ecd8
 0x7f4bb4a2a290:    7f4c94028d40
 0x7f4bb4a2a2a0:   7f4bc0c8e678 7f4bc09e09f3
 0x7f4bb4a2a2b0:   7f4c940267c8 
 0x7f4bb4a2a2c0:   7f4bc0c8e678 0006b4a2a336
 0x7f4bb4a2a2d0:   002f 7f4bc0c8dbc0
 0x7f4bb4a2a2e0:   7f4bc0c72340 7f4bc0c725d0
 0x7f4bb4a2a2f0:   7f4bb4a2a2d8 
 0x7f4bb4a2a300:   7f4bc0c72340 7f4bc0a0c600
 0x7f4bb4a2a310:   7f4c944dad98 0001
 0x7f4bb4a2a320:    80bf775da490b900
 0x7f4bb4a2a330:   7f4c944d9398 7f4bc0c8db40
 0x7f4bb4a2a340:   7f4bc0c73950 7f4bc0c725d0
 0x7f4bb4a2a350:    7f4bc0c74a80
 0x7f4bb4a2a360:   7f4bb4a2a42f 7f4bc09dd7f5
 0x7f4bb4a2a370:   0002 0001
 0x7f4bb4a2a380:    
 0x7f4bb4a2a390:    
 0x7f4bb4a2a3a0:   0002 0001
 0x7f4bb4a2a3b0:    
 0x7f4bb4a2a3c0:    
 0x7f4bb4a2a3d0:   7f4bb4a2a560 80bf775da490b900
 0x7f4bb4a2a3e0:   7f4bb4a2a488 7f4bb4a2a430
 0x7f4bb4a2a3f0:   7f4bc0c74a58 7f4bb4a2a660
 0x7f4bb4a2a400:   7f4bc0c74ac0 7f4bc0c74a80
 0x7f4bb4a2a410:   7f4bb4a2a42f 7f4bc318ed5d 
 Instructions: (pc=0x7f4cdccea540)
 0x7f4cdccea520:   e9 e7 f7 ff ff 66 66 2e 0f 1f 84 00 00 00 00 00
 0x7f4cdccea530:   55 53 48 89 fb 48 83 ec 08 48 85 ff 74 1f 66 90
 0x7f4cdccea540:   48 8b 7b 20 48 8b 6b 28 e8 73 5f f3 ff 48 89 df
 0x7f4cdccea550:   48 89 eb e8 68 5f f3 ff 48 85 ed 75 e3 48 83 c4 
 Register to memory mapping:
 RAX=0xfffe is an unallocated location in the heap
 RBX=0x80bf775da490b900 is an unknown value
 RCX=0x7f4bb4a29ee0 is pointing into the stack for thread: 
 0x7f4c1409a000
 RDX=0x is an unknown value
 RSP=0x7f4bb4a2a220 is pointing into the stack for thread: 
 0x7f4c1409a000
 RBP=0x7f4bb4a2a430 is pointing into the stack for thread: 
 0x7f4c1409a000
 

[jira] [Assigned] (MESOS-2835) Fix typos in source comments

2015-06-10 Thread Aditi Dixit (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditi Dixit reassigned MESOS-2835:
--

Assignee: Aditi Dixit

 Fix typos in source comments
 

 Key: MESOS-2835
 URL: https://issues.apache.org/jira/browse/MESOS-2835
 Project: Mesos
  Issue Type: Bug
  Components: technical debt
Reporter: Niklas Quarfot Nielsen
Assignee: Aditi Dixit
Priority: Trivial
  Labels: newbie

 Review request https://reviews.apache.org/r/13709 carried a bunch of good 
 typo fixes, but has stalled due to other code fixes and lack of 
 attention. We should create a typo review request and get these in.





[jira] [Commented] (MESOS-2768) SIGPIPE in process::run_in_event_loop()

2015-06-10 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580176#comment-14580176
 ] 

Joris Van Remoortere commented on MESOS-2768:
-

[~vinodkone] Just wanted to let you know Ben and I are looking at this.
Can you verify whether this is happening under a debugging tool such as gdb or 
valgrind, or whether it also happens when simply executing the binary?

 SIGPIPE in process::run_in_event_loop()
 ---

 Key: MESOS-2768
 URL: https://issues.apache.org/jira/browse/MESOS-2768
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Yan Xu
Priority: Critical

 Observed in production.
 {noformat:title=slave log}
 I0526 12:17:48.027257 51633 slave.cpp:4077] Received a new estimation of the 
 oversubscribable resources 
 W0526 12:17:48.027257 51636 logging.cpp:91] RAW: Received signal SIGPIPE; 
 escalating to SIGABRT
 *** Aborted at 1432642668 (unix time) try date -d @1432642668 if you are 
 using GNU date ***
 PC: @ 0x7fa58c23eb6d raise
 *** SIGABRT (@0xc9a5) received by PID 51621 (TID 0x7fa58224c940) from PID 
 51621; stack trace: ***
 @ 0x7fa58c23eca0 (unknown)
 @ 0x7fa58c23eb6d raise
 @ 0x7fa58cc19ba7 mesos::internal::logging::handler()
 @ 0x7fa58c23eca0 (unknown)
 @ 0x7fa58c23da2b __libc_write
 @ 0x7fa58cb57b6f evpipe_write.part.5
 @ 0x7fa58d245070 process::run_in_event_loop()
 @ 0x7fa58d2441ba process::EventLoop::delay()
 @ 0x7fa58d1c3c9c process::clock::scheduleTick()
 @ 0x7fa58d1c65b1 process::Clock::timer()
 @ 0x7fa58d23915a process::delay()
 @ 0x7fa58d23a740 process::ReaperProcess::wait()
 @ 0x7fa58d21261a process::ProcessManager::resume()
 @ 0x7fa58d2128dc process::schedule()
 @ 0x7fa58c23683d start_thread
 @ 0x7fa58ba28fcd clone
 {noformat}
 {noformat:title=gdb}
 (gdb) bt
 #0  0x7fa58c23eb6d in raise () from /lib64/libpthread.so.0
 #1  0x7fa58cc19ba7 in mesos::internal::logging::handler (signal=Unhandled 
 dwarf expression opcode 0xf3
 ) at logging/logging.cpp:92
 #2  signal handler called
 #3  0x7fa58c23da2b in write () from /lib64/libpthread.so.0
 #4  0x7fa58cb57b6f in evpipe_write (loop=0x7fa58e1e79c0, flag=Unhandled 
 dwarf expression opcode 0xfa
 ) at ev.c:2172
 #5  0x7fa58d245070 in process::run_in_event_loop<Nothing>(const 
 std::function<process::Future<Nothing>()> &) (f=Unhandled dwarf expression 
 opcode 0xf3
 ) at src/libev.hpp:80
 #6  0x7fa58d2441ba in process::EventLoop::delay(const Duration &, const 
 std::function<void()> &) (duration=Unhandled dwarf expression opcode 0xf3
 ) at src/libev.cpp:106
 #7  0x7fa58d1c3c9c in process::clock::scheduleTick (timers=Unhandled 
 dwarf expression opcode 0xf3
 ) at src/clock.cpp:119
 #8  0x7fa58d1c65b1 in process::Clock::timer(const Duration &, const 
 std::function<void()> &) (duration=Unhandled dwarf expression opcode 0xf3
 ) at src/clock.cpp:254
 #9  0x7fa58d23915a in process::delay<process::ReaperProcess> 
 (duration=..., pid=Unhandled dwarf expression opcode 0xf3
 ) at ./include/process/delay.hpp:25
 #10 0x7fa58d23a740 in process::ReaperProcess::wait (this=0x2056920) at 
 src/reap.cpp:93
 #11 0x7fa58d21261a in process::ProcessManager::resume (this=0x1db8d20, 
 process=0x2056958) at src/process.cpp:2172
 #12 0x7fa58d2128dc in process::schedule (arg=Unhandled dwarf expression 
 opcode 0xf3
 ) at src/process.cpp:602
 #13 0x7fa58c23683d in start_thread () from /lib64/libpthread.so.0
 #14 0x7fa58ba28fcd in clone () from /lib64/libc.so.6
 {noformat}
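
 The backtrace shows the signal arriving inside a write() on the event loop's 
 wakeup pipe. As a general illustration of that mechanism (not a diagnosis of 
 this bug, and not libprocess or libev code), the sketch below writes to a 
 pipe whose read end has been closed. By default such a write raises SIGPIPE; 
 with the signal ignored, write() fails with EPIPE instead. The function name 
 is hypothetical.

```cpp
#include <cerrno>
#include <signal.h>
#include <unistd.h>

// Writes one byte to a pipe whose read end is already closed and returns
// the resulting errno (0 if the write unexpectedly succeeds, -1 if the
// pipe could not be created). SIGPIPE is ignored for the duration of the
// call so that the failure is reported through errno, not a signal.
int writeToClosedPipe()
{
  int fds[2];
  if (pipe(fds) != 0) {
    return -1;
  }

  // Ignore SIGPIPE, remembering the previous disposition.
  struct sigaction ignore;
  struct sigaction previous;
  ignore.sa_handler = SIG_IGN;
  sigemptyset(&ignore.sa_mask);
  ignore.sa_flags = 0;
  sigaction(SIGPIPE, &ignore, &previous);

  close(fds[0]);  // No reader is left on the pipe.

  const char byte = 1;
  int error = 0;
  if (write(fds[1], &byte, 1) < 0) {
    error = errno;  // Expected to be EPIPE.
  }

  close(fds[1]);
  sigaction(SIGPIPE, &previous, nullptr);  // Restore the old disposition.
  return error;
}
```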





[jira] [Comment Edited] (MESOS-2800) Rename Option<T>::get(const T& _t) to getOrElse() and refactor the original function

2015-06-10 Thread Mark Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580145#comment-14580145
 ] 

Mark Wang edited comment on MESOS-2800 at 6/10/15 7:12 AM:
---

I think I missed a lot of files that need to be addressed. Here is a complete 
list: 
{noformat}
mesos:
  src/log/catchup.cpp
  src/master/main.cpp
  src/master/master.hpp
  src/slave/containerizer/containerizer.cpp
  src/tests/cluster.hpp
libprocess:
  3rdparty/libprocess/src/process.cpp
  3rdparty/libprocess/src/subprocess.cpp
stout:
  3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp
  3rdparty/libprocess/3rdparty/stout/include/stout/os/osx.hpp
  3rdparty/libprocess/3rdparty/stout/tests/option_tests.cpp
{noformat}
by lines:
$ git diff HEAD~1 --numstat
{noformat}
1   1   3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp
2   2   3rdparty/libprocess/3rdparty/stout/include/stout/os/osx.hpp
11  0   3rdparty/libprocess/3rdparty/stout/tests/option_tests.cpp
1   1   3rdparty/libprocess/src/process.cpp
1   1   3rdparty/libprocess/src/subprocess.cpp
6   3   src/cli/mesos-ps
1   1   src/log/catchup.cpp
2   2   src/master/main.cpp
2   2   src/master/master.hpp
5   5   src/slave/containerizer/containerizer.cpp
5   5   src/tests/cluster.hpp
{noformat}
So are we still going for the rename? If so, am I supposed to open an issue in 
each of the three projects?
What do you think [~jvanremoortere]? 


was (Author: balamark):
I think I missed a lot of files that need to be addressed. Here is a complete 
list: 
mesos:
{noformat}
  src/log/catchup.cpp
  src/master/main.cpp
  src/master/master.hpp
  src/slave/containerizer/containerizer.cpp
  src/tests/cluster.hpp
libprocess:
  3rdparty/libprocess/src/process.cpp
  3rdparty/libprocess/src/subprocess.cpp
stout:
  3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp
  3rdparty/libprocess/3rdparty/stout/include/stout/os/osx.hpp
  3rdparty/libprocess/3rdparty/stout/tests/option_tests.cpp
{noformat}
by lines:
$ git diff HEAD~1 --numstat
{noformat}
1   1   3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp
2   2   3rdparty/libprocess/3rdparty/stout/include/stout/os/osx.hpp
11  0   3rdparty/libprocess/3rdparty/stout/tests/option_tests.cpp
1   1   3rdparty/libprocess/src/process.cpp
1   1   3rdparty/libprocess/src/subprocess.cpp
6   3   src/cli/mesos-ps
1   1   src/log/catchup.cpp
2   2   src/master/main.cpp
2   2   src/master/master.hpp
5   5   src/slave/containerizer/containerizer.cpp
5   5   src/tests/cluster.hpp
{noformat}
So are we still going for the rename? If so, am I supposed to open an issue in 
each of the three projects?
What do you think [~jvanremoortere]? 






[jira] [Commented] (MESOS-2800) Rename Option<T>::get(const T& _t) to getOrElse() and refactor the original function

2015-06-10 Thread Mark Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580145#comment-14580145
 ] 

Mark Wang commented on MESOS-2800:
--

I think I missed a lot of files that need to be addressed. Here is a complete 
list: 
mesos:
{noformat}
  src/log/catchup.cpp
  src/master/main.cpp
  src/master/master.hpp
  src/slave/containerizer/containerizer.cpp
  src/tests/cluster.hpp
libprocess:
  3rdparty/libprocess/src/process.cpp
  3rdparty/libprocess/src/subprocess.cpp
stout:
  3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp
  3rdparty/libprocess/3rdparty/stout/include/stout/os/osx.hpp
  3rdparty/libprocess/3rdparty/stout/tests/option_tests.cpp
{noformat}
by lines:
$ git diff HEAD~1 --numstat
{noformat}
1   1   3rdparty/libprocess/3rdparty/stout/include/stout/option.hpp
2   2   3rdparty/libprocess/3rdparty/stout/include/stout/os/osx.hpp
11  0   3rdparty/libprocess/3rdparty/stout/tests/option_tests.cpp
1   1   3rdparty/libprocess/src/process.cpp
1   1   3rdparty/libprocess/src/subprocess.cpp
6   3   src/cli/mesos-ps
1   1   src/log/catchup.cpp
2   2   src/master/main.cpp
2   2   src/master/master.hpp
5   5   src/slave/containerizer/containerizer.cpp
5   5   src/tests/cluster.hpp
{noformat}
So are we still going for the rename? If so, am I supposed to open an issue in 
each of the three projects?
What do you think [~jvanremoortere]? 






[jira] [Commented] (MESOS-2800) Rename Option<T>::get(const T& _t) to getOrElse() and refactor the original function

2015-06-10 Thread Joris Van Remoortere (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580169#comment-14580169
 ] 

Joris Van Remoortere commented on MESOS-2800:
-

I don't think there are so many that it becomes prohibitive.
No need for 3 separate JIRA issues. Just make 3 different patches and post all 
3 reviewboard links here :-)







[jira] [Commented] (MESOS-2800) Rename Option<T>::get(const T& _t) to getOrElse() and refactor the original function

2015-06-10 Thread Mark Wang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580191#comment-14580191
 ] 

Mark Wang commented on MESOS-2800:
--

:)
https://reviews.apache.org/r/35285/
https://reviews.apache.org/r/35286/
https://reviews.apache.org/r/35287/







[jira] [Created] (MESOS-2841) FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata

2015-06-10 Thread James DeFelice (JIRA)
James DeFelice created MESOS-2841:
-

 Summary: FrameworkInfo should include a Labels field to support 
arbitrary, lightweight metadata
 Key: MESOS-2841
 URL: https://issues.apache.org/jira/browse/MESOS-2841
 Project: Mesos
  Issue Type: Improvement
Reporter: James DeFelice


A framework instance may offer specific capabilities to the cluster: storage, 
smartly-balanced request handling across deployed tasks, access to 3rd party 
services outside of the cluster, etc. These capabilities may or may not be 
utilized by all, or even most mesos clusters. However, it should be possible 
for processes running in the cluster to discover capabilities or features of 
frameworks in order to achieve a higher level of functionality and a more 
seamless integration experience across the cluster.

A rich discovery API attached to the FrameworkInfo could result in some form of 
early lock-in: there are probably many ways to realize cross-framework 
integration and external services integration that we haven't considered yet. 
Rather than over-specify a discovery info message type at the framework level I 
think FrameworkInfo should expose a **very generic** way to supply metadata for 
interested consumers (other processes, tasks, etc).

Adding a Labels field to FrameworkInfo reuses an existing message type and 
seems to fit well with the overall intent: attaching generic metadata to a 
framework instance. These labels should be visible when querying a mesos 
master's state.json endpoint.
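
As a sketch of how a consumer might use such labels, the snippet below models 
labels as a list of key/value pairs attached to a framework and looks one up 
by key. Label, FrameworkInfo and findLabel here are simplified stand-ins 
written for illustration, not the Mesos protobuf API, and any label keys used 
with it are made up.

```cpp
#include <string>
#include <vector>

// Simplified stand-ins for the protobuf messages (assumptions).
struct Label {
  std::string key;
  std::string value;
};

struct FrameworkInfo {
  std::string name;
  std::vector<Label> labels;  // The proposed generic metadata field.
};

// Returns a pointer to the value of the first label with the given key,
// or nullptr if the framework carries no such label.
const std::string* findLabel(
    const FrameworkInfo& framework,
    const std::string& key)
{
  for (const Label& label : framework.labels) {
    if (label.key == key) {
      return &label.value;
    }
  }
  return nullptr;
}
```

Because the field is just key/value pairs, consumers that do not recognize a 
key can ignore it, which is what keeps the approach free of early lock-in.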





[jira] [Commented] (MESOS-2839) Segmentation fault in freeaddrinfo when used with illegal/misconfigured IP

2015-06-10 Thread Alberto (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580094#comment-14580094
 ] 

Alberto commented on MESOS-2839:


I got this error using Mesos 0.22.1
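
For readers unfamiliar with the function named in the crash frame: 
freeaddrinfo() should only be given a list returned by a successful 
getaddrinfo() call. The sketch below shows that defensive pattern. It is a 
generic illustration, not the Mesos/libprocess code path and not a confirmed 
root cause of this crash; isNumericAddress is a hypothetical helper.

```cpp
#include <cstring>
#include <netdb.h>
#include <sys/socket.h>
#include <sys/types.h>

// Returns true if `ip` parses as a numeric IPv4 or IPv6 address.
bool isNumericAddress(const char* ip)
{
  struct addrinfo hints;
  std::memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_UNSPEC;
  hints.ai_flags = AI_NUMERICHOST;  // No DNS lookup; parse the literal only.

  struct addrinfo* result = nullptr;  // Never leave this uninitialized.
  const int error = getaddrinfo(ip, nullptr, &hints, &result);
  if (error != 0) {
    // On failure `result` is not a valid list, so it must not be freed.
    return false;
  }

  freeaddrinfo(result);
  return true;
}
```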


[jira] [Resolved] (MESOS-2820) SIGSEGV received on start mesos scheduler driver

2015-06-10 Thread Alberto (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alberto resolved MESOS-2820.

Resolution: Duplicate

See #2839

 SIGSEGV received on start mesos scheduler driver
 

 Key: MESOS-2820
 URL: https://issues.apache.org/jira/browse/MESOS-2820
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.22.1
 Environment: Linux Fedora
Reporter: Alberto

 I'm trying to create a CassandraSQLContext using the DataStax 
 spark-cassandra-connector. My SparkContext is configured to use a locally 
 running Mesos cluster. 
 When I create the CassandraSQLContext using proper parameters everything 
 works fine, but when I pass a nonexistent or wrong Mesos cluster IP I get a 
 SIGSEGV signal and my VM shuts down. The trace of the error follows:
 ABOUT TO CREATE THE CASSANDRA SQL CONTEXT...
 [Dynamic-linking native method java.lang.Class.getCheckMemberAccessMethod ... 
 JNI]
 [Dynamic-linking native method java.net.NetworkInterface.init ... JNI]
 [Dynamic-linking native method java.net.NetworkInterface.getAll ... JNI]
 [Dynamic-linking native method java.net.Inet6AddressImpl.getHostByAddr ... 
 JNI]
 [Dynamic-linking native method java.lang.UNIXProcess.init ... JNI]
 [Dynamic-linking native method java.lang.UNIXProcess.forkAndExec ... JNI]
 [Dynamic-linking native method java.lang.UNIXProcess.waitForProcessExit ... 
 JNI]
 [Dynamic-linking native method java.net.PlainSocketImpl.socketCreate ... JNI]
 [Dynamic-linking native method java.net.PlainSocketImpl.socketBind ... JNI]
 [Dynamic-linking native method java.net.PlainSocketImpl.socketListen ... JNI]
 [Dynamic-linking native method java.net.PlainSocketImpl.socketSetOption ... 
 JNI]
 [Dynamic-linking native method java.net.PlainSocketImpl.socketAccept ... JNI]
 [Dynamic-linking native method sun.nio.ch.ServerSocketChannelImpl.accept0 ... 
 JNI]
 [Dynamic-linking native method 
 org.apache.mesos.MesosSchedulerDriver.initialize ... JNI]
 WARNING: Logging before InitGoogleLogging() is written to STDERR
 W0605 13:43:12.363229 25629 sched.cpp:1323] 
 **
 Scheduler driver bound to loopback interface! Cannot communicate with remote 
 master(s). You might want to set 'LIBPROCESS_IP' environment variable to use 
 a routable IP address.
 **
 [Dynamic-linking native method org.apache.mesos.MesosSchedulerDriver.start 
 ... JNI]
 #
 # A fatal error has been detected by the Java Runtime Environment:
 #
 #  SIGSEGV (0xb) at pc=0x7f537a894495, pid=25488, tid=139995558737664
 #
 # JRE version: Java(TM) SE Runtime Environment (7.0_75-b13) (build 
 1.7.0_75-b13)
 # Java VM: Java HotSpot(TM) 64-Bit Server VM (24.75-b04 mixed mode 
 linux-amd64 compressed oops)
 # Problematic frame:
 # C  [libc.so.6+0x83495]  cfree+0x55
 #
 # Core dump written. Default location: 
 /home/arodriguez/dev/datavis-master/back/core or core.25488 (max size 5 
 kB). To ensure a full core dump, try ulimit -c unlimited before starting 
 Java again
 #
 # An error report file with more information is saved as:
 # /home/arodriguez/dev/datavis-master/back/hs_err_pid25488.log
 #
 # If you would like to submit a bug report, please visit:
 #   http://bugreport.sun.com/bugreport/crash.jsp
 # The crash happened outside the Java Virtual Machine in native code.
 # See problematic frame for where to report the bug.
 #





[jira] [Comment Edited] (MESOS-2820) SIGSEGV received on start mesos scheduler driver

2015-06-10 Thread Alberto (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580133#comment-14580133
 ] 

Alberto edited comment on MESOS-2820 at 6/10/15 6:56 AM:
-

See MESOS-2839


was (Author: ardlema):
See #2839






[jira] [Commented] (MESOS-2845) Command tasks lead to a mixing of revocable / non-revocable cpus and memory within the container.

2015-06-10 Thread Ian Downes (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14581024#comment-14581024
 ] 

Ian Downes commented on MESOS-2845:
---

+1 to removing the hack.

In the interim, can we simply match the executor's resource 
revocable/non-revocable state to that of the task's resources? IIRC the command 
executor can only run a single task anyway.
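
A minimal sketch of the matching idea suggested above, assuming a simplified, hypothetical Resource struct rather than the real Mesos protobuf types. The names Resource, taskUsesRevocable and executorAllowance are illustrative only and do not exist in Mesos; the allowance values are passed in by the caller.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a Mesos resource; the real
// protobuf type is richer than this.
struct Resource {
  std::string name;  // e.g. "cpus" or "mem"
  double value;
  bool revocable;
};

// Returns true if the task holds any revocable resource with this name.
bool taskUsesRevocable(const std::vector<Resource>& task,
                       const std::string& name) {
  for (const Resource& r : task) {
    if (r.name == name && r.revocable) {
      return true;
    }
  }
  return false;
}

// Builds the command executor allowance so that each resource inherits the
// revocable / non-revocable state of the task's resource of the same name.
std::vector<Resource> executorAllowance(const std::vector<Resource>& task,
                                        double cpus, double memMb) {
  return {
      {"cpus", cpus, taskUsesRevocable(task, "cpus")},
      {"mem", memMb, taskUsesRevocable(task, "mem")},
  };
}
```

Because the command executor runs a single task, deriving the state from that one task's resources avoids mixing revocable and non-revocable resources of the same kind in the container.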

 Command tasks lead to a mixing of revocable / non-revocable cpus and memory 
 within the container.
 -

 Key: MESOS-2845
 URL: https://issues.apache.org/jira/browse/MESOS-2845
 Project: Mesos
  Issue Type: Bug
  Components: slave
Reporter: Benjamin Mahler
  Labels: twitter

 Due to the hack 
 [here|https://github.com/apache/mesos/blob/9a5788801e7fc95fce99749a23803fc52c67c0ce/src/slave/slave.cpp#L3101],
  where we add a small set of resources into the command executor:
 {code}
 ExecutorInfo Slave::getExecutorInfo(
     const FrameworkID& frameworkId,
     const TaskInfo& task)
 {
   if (task.has_command()) {
     ...
     // XXX: These are always non-revocable.
     // Add an allowance for the command executor. This does lead to a
     // small overcommit of resources.
     executor.mutable_resources()->MergeFrom(
         Resources::parse(
           "cpus:" + stringify(DEFAULT_EXECUTOR_CPUS) + ";" +
           "mem:" + stringify(DEFAULT_EXECUTOR_MEM.megabytes())).get());
   }
   ...
 }
 {code}
 The obvious extension here would be to make these revocable, but it would be 
 great to remove this hack entirely.
 Seems to originate in [r/22251|https://reviews.apache.org/r/22251/] from 
 MESOS-1417.
 FYI [~idownes] [~jieyu]





[jira] [Updated] (MESOS-2840) MesosContainerizer support multiple image provisioners

2015-06-10 Thread Timothy Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated MESOS-2840:

Description: 
We want to build on the Appc integration interfaces so that 
MesosContainerizer supports multiple image formats.
This allows our future work on isolators to support any container image format.

 MesosContainerizer support multiple image provisioners
 --

 Key: MESOS-2840
 URL: https://issues.apache.org/jira/browse/MESOS-2840
 Project: Mesos
  Issue Type: Epic
  Components: containerization, docker
Affects Versions: 0.23.0
Reporter: Marco Massenzio
Assignee: Timothy Chen
  Labels: mesosphere

 We want to build on the Appc integration interfaces so that 
 MesosContainerizer supports multiple image formats.
 This allows our future work on isolators to support any container image 
 format.





[jira] [Created] (MESOS-2846) Refactor passing and instantiating multiple provisioners

2015-06-10 Thread Timothy Chen (JIRA)
Timothy Chen created MESOS-2846:
---

 Summary: Refactor passing and instantiating multiple provisioners
 Key: MESOS-2846
 URL: https://issues.apache.org/jira/browse/MESOS-2846
 Project: Mesos
  Issue Type: Improvement
Reporter: Timothy Chen
Assignee: Ian Downes








[jira] [Commented] (MESOS-2845) Command tasks lead to a mixing of revocable / non-revocable cpus and memory within the container.

2015-06-10 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14581054#comment-14581054
 ] 

Vinod Kone commented on MESOS-2845:
---

In the interim, we don't have to change anything because the task's resources 
are added to executor info right before launching.

+1 for fixing the hack in the long term, probably by transferring the task's 
resources to the executor.

 Command tasks lead to a mixing of revocable / non-revocable cpus and memory 
 within the container.
 -

 Key: MESOS-2845
 URL: https://issues.apache.org/jira/browse/MESOS-2845
 Project: Mesos
  Issue Type: Bug
  Components: slave
Reporter: Benjamin Mahler
  Labels: twitter

 Due to the hack 
 [here|https://github.com/apache/mesos/blob/9a5788801e7fc95fce99749a23803fc52c67c0ce/src/slave/slave.cpp#L3101],
  where we add a small set of resources into the command executor:
 {code}
 ExecutorInfo Slave::getExecutorInfo(
     const FrameworkID& frameworkId,
     const TaskInfo& task)
 {
   if (task.has_command()) {
     ...
     // XXX: These are always non-revocable.
     // Add an allowance for the command executor. This does lead to a
     // small overcommit of resources.
     executor.mutable_resources()->MergeFrom(
         Resources::parse(
           "cpus:" + stringify(DEFAULT_EXECUTOR_CPUS) + ";" +
           "mem:" + stringify(DEFAULT_EXECUTOR_MEM.megabytes())).get());
   }
   ...
 }
 {code}
 The obvious extension here would be to make these revocable, but it would be 
 great to remove this hack entirely.
 Seems to originate in [r/22251|https://reviews.apache.org/r/22251/] from 
 MESOS-1417.
 FYI [~idownes] [~jieyu]





[jira] [Comment Edited] (MESOS-2753) Master should validate tasks using oversubscribed resources

2015-06-10 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14568210#comment-14568210
 ] 

Vinod Kone edited comment on MESOS-2753 at 6/10/15 8:03 PM:


https://reviews.apache.org/r/35309/
https://reviews.apache.org/r/34910/
https://reviews.apache.org/r/34911/



was (Author: vinodkone):
https://reviews.apache.org/r/34910/
https://reviews.apache.org/r/34911/

 Master should validate tasks using oversubscribed resources
 ---

 Key: MESOS-2753
 URL: https://issues.apache.org/jira/browse/MESOS-2753
 Project: Mesos
  Issue Type: Task
  Components: isolation, master
Affects Versions: 0.23.0
Reporter: Ian Downes
Assignee: Vinod Kone
  Labels: twitter

 The current implementation, out for [review|https://reviews.apache.org/r/34310], 
 only supports setting the priority of containers with revocable CPU if it's 
 specified in the initial executor info resources. This should be enforced at 
 the master.
 Also, the master should make sure that the oversubscribed resources used by 
 the task are valid.





[jira] [Created] (MESOS-2848) Local filesystem docker image discovery

2015-06-10 Thread Timothy Chen (JIRA)
Timothy Chen created MESOS-2848:
---

 Summary: Local filesystem docker image discovery
 Key: MESOS-2848
 URL: https://issues.apache.org/jira/browse/MESOS-2848
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
Reporter: Timothy Chen
Assignee: Timothy Chen








[jira] [Updated] (MESOS-2497) Create synchronous validations for Calls

2015-06-10 Thread Isabel Jimenez (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Jimenez updated MESOS-2497:
--
Sprint: Mesosphere Sprint 12

 Create synchronous validations for Calls
 

 Key: MESOS-2497
 URL: https://issues.apache.org/jira/browse/MESOS-2497
 Project: Mesos
  Issue Type: Bug
Reporter: Isabel Jimenez
Assignee: Isabel Jimenez
  Labels: HTTP, mesosphere

 The /call endpoint returns a 202 Accepted code, but it has to perform some 
 basic validation first. If validation fails, it returns a 4xx code. We 
 have to create a mechanism that validates the 'request' and sends back the 
 appropriate code.
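
A rough sketch of such a synchronous validation step, under stated assumptions: the Request struct, the validateCall function, the individual checks and the specific 4xx codes chosen here are all hypothetical. The ticket only states that the endpoint answers 202 on success and some 4xx code on failure.

```cpp
#include <string>

// Hypothetical, simplified view of an incoming HTTP request.
struct Request {
  std::string method;
  std::string contentType;
  std::string body;
};

// Runs the synchronous checks and returns the HTTP status code to send:
// 202 when the request is accepted, a 4xx code otherwise.
int validateCall(const Request& request) {
  if (request.method != "POST") {
    return 405;  // Method Not Allowed
  }
  if (request.contentType != "application/json" &&
      request.contentType != "application/x-protobuf") {
    return 415;  // Unsupported Media Type
  }
  if (request.body.empty()) {
    return 400;  // Bad Request
  }
  return 202;  // Accepted; the actual processing happens afterwards.
}
```

Keeping the checks in one function that returns only a status code makes it straightforward to run them before the request is handed off for processing.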





[jira] [Updated] (MESOS-2841) FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata

2015-06-10 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-2841:
--
Remaining Estimate: (was: 8m)
 Original Estimate: (was: 8m)
  Story Points: 8

 FrameworkInfo should include a Labels field to support arbitrary, lightweight 
 metadata
 --

 Key: MESOS-2841
 URL: https://issues.apache.org/jira/browse/MESOS-2841
 Project: Mesos
  Issue Type: Improvement
Reporter: James DeFelice

 A framework instance may offer specific capabilities to the cluster: storage, 
 smartly-balanced request handling across deployed tasks, access to 3rd party 
 services outside of the cluster, etc. These capabilities may or may not be 
 utilized by all, or even most mesos clusters. However, it should be possible 
 for processes running in the cluster to discover capabilities or features of 
 frameworks in order to achieve a higher level of functionality and a more 
 seamless integration experience across the cluster.
 A rich discovery API attached to the FrameworkInfo could result in some form 
 of early lock-in: there are probably many ways to realize cross-framework 
 integration and external services integration that we haven't considered yet. 
 Rather than over-specify a discovery info message type at the framework level 
 I think FrameworkInfo should expose a **very generic** way to supply metadata 
 for interested consumers (other processes, tasks, etc).
 Adding a Labels field to FrameworkInfo reuses an existing message type and 
 seems to fit well with the overall intent: attaching generic metadata to a 
 framework instance. These labels should be visible when querying a mesos 
 master's state.json endpoint.
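
To illustrate the kind of generic metadata described above, here is a small sketch using simplified, hypothetical stand-ins for the Label and FrameworkInfo messages (plain structs, not the generated protobuf classes). The helper labelValues is an assumption made for this example.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a single key/value label.
struct Label {
  std::string key;
  std::string value;
};

// Hypothetical stand-in for FrameworkInfo, reduced to the fields needed here.
struct FrameworkInfo {
  std::string name;
  std::vector<Label> labels;
};

// Returns every value attached to the framework under the given key.
// A key may appear more than once, so all matches are collected.
std::vector<std::string> labelValues(const FrameworkInfo& info,
                                     const std::string& key) {
  std::vector<std::string> values;
  for (const Label& label : info.labels) {
    if (label.key == key) {
      values.push_back(label.value);
    }
  }
  return values;
}
```

A consumer that does not recognize a key can simply ignore it, which is what keeps this approach free of the early lock-in a richer discovery API might cause.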





[jira] [Commented] (MESOS-2775) Slave should expose metrics about oversubscribed resources

2015-06-10 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580990#comment-14580990
 ] 

Benjamin Mahler commented on MESOS-2775:


https://reviews.apache.org/r/35312/
https://reviews.apache.org/r/35313/

 Slave should expose metrics about oversubscribed resources
 --

 Key: MESOS-2775
 URL: https://issues.apache.org/jira/browse/MESOS-2775
 Project: Mesos
  Issue Type: Task
Reporter: Vinod Kone
Assignee: Benjamin Mahler
  Labels: twitter

 metrics/snapshot should expose metrics on oversubscribed resources (allocated 
 and available). 





[jira] [Created] (MESOS-2845) Command tasks lead to a mixing of revocable / non-revocable cpus and memory within the container.

2015-06-10 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-2845:
--

 Summary: Command tasks lead to a mixing of revocable / 
non-revocable cpus and memory within the container.
 Key: MESOS-2845
 URL: https://issues.apache.org/jira/browse/MESOS-2845
 Project: Mesos
  Issue Type: Bug
  Components: slave
Reporter: Benjamin Mahler


Due to the hack 
[here|https://github.com/apache/mesos/blob/9a5788801e7fc95fce99749a23803fc52c67c0ce/src/slave/slave.cpp#L3101],
 where we add a small set of resources into the command executor:

{code}
ExecutorInfo Slave::getExecutorInfo(
    const FrameworkID& frameworkId,
    const TaskInfo& task)
{
  if (task.has_command()) {
    ...

    // XXX: These are always non-revocable.

    // Add an allowance for the command executor. This does lead to a
    // small overcommit of resources.
    executor.mutable_resources()->MergeFrom(
        Resources::parse(
          "cpus:" + stringify(DEFAULT_EXECUTOR_CPUS) + ";" +
          "mem:" + stringify(DEFAULT_EXECUTOR_MEM.megabytes())).get());
  }
  ...
}
{code}

The obvious extension here would be to make these revocable, but it would be 
great to remove this hack entirely.

Seems to originate in [r/22251|https://reviews.apache.org/r/22251/] from 
MESOS-1417.

FYI [~idownes] [~jieyu]





[jira] [Updated] (MESOS-2840) Support multiple containers

2015-06-10 Thread Timothy Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated MESOS-2840:

Description: (was: See the [Design Doc|http://docs.google.com]
({{TODO: add doc link}}))

 Support multiple containers
 ---

 Key: MESOS-2840
 URL: https://issues.apache.org/jira/browse/MESOS-2840
 Project: Mesos
  Issue Type: Epic
  Components: containerization, docker
Affects Versions: 0.23.0
Reporter: Marco Massenzio
Assignee: Timothy Chen
  Labels: mesosphere







[jira] [Created] (MESOS-2847) User can specify image provisioners and provisioning backends

2015-06-10 Thread Timothy Chen (JIRA)
Timothy Chen created MESOS-2847:
---

 Summary: User can specify image provisioners and provisioning 
backends
 Key: MESOS-2847
 URL: https://issues.apache.org/jira/browse/MESOS-2847
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
Reporter: Timothy Chen
Assignee: Ian Downes


Support new flags so users can pass in provisioners and backends





[jira] [Updated] (MESOS-2840) MesosContainerizer support multiple image provisioners

2015-06-10 Thread Timothy Chen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2840?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Timothy Chen updated MESOS-2840:

Summary: MesosContainerizer support multiple image provisioners  (was: 
Support multiple containers)

 MesosContainerizer support multiple image provisioners
 --

 Key: MESOS-2840
 URL: https://issues.apache.org/jira/browse/MESOS-2840
 Project: Mesos
  Issue Type: Epic
  Components: containerization, docker
Affects Versions: 0.23.0
Reporter: Marco Massenzio
Assignee: Timothy Chen
  Labels: mesosphere







[jira] [Created] (MESOS-2849) Implement Docker image store

2015-06-10 Thread Timothy Chen (JIRA)
Timothy Chen created MESOS-2849:
---

 Summary: Implement Docker image store
 Key: MESOS-2849
 URL: https://issues.apache.org/jira/browse/MESOS-2849
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
Reporter: Timothy Chen
Assignee: Timothy Chen








[jira] [Resolved] (MESOS-1825) Support the webui over HTTPS.

2015-06-10 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone resolved MESOS-1825.
---
   Resolution: Fixed
Fix Version/s: 0.23.0

commit f55dc981889ac45775bb4bb023695e8a0ed9799d
Author: Oliver Nicholas b...@wonlove.net
Date:   Wed Jun 10 13:51:46 2015 -0700

Make a bunch of JSONP callback URLs in mesos UI protocol-relative.

Review: https://reviews.apache.org/r/35270
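
For readers unfamiliar with the term, a protocol-relative URL drops the scheme and starts with "//", so the browser reuses whatever scheme the page itself was loaded with. The sketch below shows the transformation in C++ for consistency with the other code in this thread; the actual change is in the web UI's JavaScript, and the function name toProtocolRelative is purely illustrative.

```cpp
#include <string>

// Rewrites an absolute http:// or https:// URL as a protocol-relative URL
// ("//host/path"). Any other input is returned unchanged.
std::string toProtocolRelative(const std::string& url) {
  const std::string schemes[] = {"http:", "https:"};
  for (const std::string& scheme : schemes) {
    const std::string prefix = scheme + "//";
    if (url.compare(0, prefix.size(), prefix) == 0) {
      // Drop only the scheme and keep the leading "//".
      return url.substr(scheme.size());
    }
  }
  return url;
}
```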


 Support the webui over HTTPS.
 -

 Key: MESOS-1825
 URL: https://issues.apache.org/jira/browse/MESOS-1825
 Project: Mesos
  Issue Type: Bug
  Components: webui
Reporter: Kien Pham
Assignee: Oliver Nicholas
Priority: Minor
  Labels: newbie
 Fix For: 0.23.0


 Right now in the Mesos UI, links are hardcoded to http://. They should not be 
 hardcoded, so that the UI can also support https links.
 Ex:
 https://github.com/apache/mesos/blob/master/src/webui/master/static/js/controllers.js#L17





[jira] [Updated] (MESOS-1825) Support the webui over HTTPS.

2015-06-10 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-1825:
--
Assignee: Oliver Nicholas

 Support the webui over HTTPS.
 -

 Key: MESOS-1825
 URL: https://issues.apache.org/jira/browse/MESOS-1825
 Project: Mesos
  Issue Type: Bug
  Components: webui
Reporter: Kien Pham
Assignee: Oliver Nicholas
Priority: Minor
  Labels: newbie

 Right now in the Mesos UI, links are hardcoded to http://. They should not be 
 hardcoded, so that the UI can also support https links.
 Ex:
 https://github.com/apache/mesos/blob/master/src/webui/master/static/js/controllers.js#L17





[jira] [Updated] (MESOS-2720) Implement protobufs for master operator endpoints

2015-06-10 Thread Isabel Jimenez (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Jimenez updated MESOS-2720:
--
Assignee: (was: Isabel Jimenez)

 Implement protobufs for master operator endpoints
 -

 Key: MESOS-2720
 URL: https://issues.apache.org/jira/browse/MESOS-2720
 Project: Mesos
  Issue Type: Improvement
Reporter: Isabel Jimenez

 We should define protobufs for master operator endpoints so as to provide a 
 structure we can refer to for each possible return from an endpoint.





[jira] [Updated] (MESOS-2841) FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata

2015-06-10 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-2841:
--
Labels: mesosphere  (was: )

 FrameworkInfo should include a Labels field to support arbitrary, lightweight 
 metadata
 --

 Key: MESOS-2841
 URL: https://issues.apache.org/jira/browse/MESOS-2841
 Project: Mesos
  Issue Type: Improvement
Reporter: James DeFelice
  Labels: mesosphere

 A framework instance may offer specific capabilities to the cluster: storage, 
 smartly-balanced request handling across deployed tasks, access to 3rd party 
 services outside of the cluster, etc. These capabilities may or may not be 
 utilized by all, or even most mesos clusters. However, it should be possible 
 for processes running in the cluster to discover capabilities or features of 
 frameworks in order to achieve a higher level of functionality and a more 
 seamless integration experience across the cluster.
 A rich discovery API attached to the FrameworkInfo could result in some form 
 of early lock-in: there are probably many ways to realize cross-framework 
 integration and external services integration that we haven't considered yet. 
 Rather than over-specify a discovery info message type at the framework level 
 I think FrameworkInfo should expose a **very generic** way to supply metadata 
 for interested consumers (other processes, tasks, etc).
 Adding a Labels field to FrameworkInfo reuses an existing message type and 
 seems to fit well with the overall intent: attaching generic metadata to a 
 framework instance. These labels should be visible when querying a mesos 
 master's state.json endpoint.





[jira] [Updated] (MESOS-2841) FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata

2015-06-10 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-2841:
--
Issue Type: Epic  (was: Improvement)

 FrameworkInfo should include a Labels field to support arbitrary, lightweight 
 metadata
 --

 Key: MESOS-2841
 URL: https://issues.apache.org/jira/browse/MESOS-2841
 Project: Mesos
  Issue Type: Epic
Reporter: James DeFelice
  Labels: mesosphere

 A framework instance may offer specific capabilities to the cluster: storage, 
 smartly-balanced request handling across deployed tasks, access to 3rd party 
 services outside of the cluster, etc. These capabilities may or may not be 
 utilized by all, or even most mesos clusters. However, it should be possible 
 for processes running in the cluster to discover capabilities or features of 
 frameworks in order to achieve a higher level of functionality and a more 
 seamless integration experience across the cluster.
 A rich discovery API attached to the FrameworkInfo could result in some form 
 of early lock-in: there are probably many ways to realize cross-framework 
 integration and external services integration that we haven't considered yet. 
 Rather than over-specify a discovery info message type at the framework level 
 I think FrameworkInfo should expose a **very generic** way to supply metadata 
 for interested consumers (other processes, tasks, etc).
 Adding a Labels field to FrameworkInfo reuses an existing message type and 
 seems to fit well with the overall intent: attaching generic metadata to a 
 framework instance. These labels should be visible when querying a mesos 
 master's state.json endpoint.





[jira] [Updated] (MESOS-2719) Removing '.json' extension in master endpoints url

2015-06-10 Thread Isabel Jimenez (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2719?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Jimenez updated MESOS-2719:
--
Sprint: Mesosphere Sprint 12

 Removing '.json' extension in master endpoints url
 --

 Key: MESOS-2719
 URL: https://issues.apache.org/jira/browse/MESOS-2719
 Project: Mesos
  Issue Type: Improvement
Reporter: Isabel Jimenez
Assignee: Isabel Jimenez
  Labels: HTTP

 Remove the '.json' extension on endpoints such as `/master/stats.json` so it 
 becomes `/master/stats`.
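
The renaming can be pictured as a simple suffix strip. The helper below is a hypothetical illustration (stripJsonExtension is not a Mesos function); it says nothing about how the routes are actually registered in the master.

```cpp
#include <string>

// Returns the endpoint path without a trailing ".json"; paths that do not
// end in ".json" are returned unchanged.
std::string stripJsonExtension(const std::string& path) {
  const std::string suffix = ".json";
  if (path.size() > suffix.size() &&
      path.compare(path.size() - suffix.size(), suffix.size(), suffix) == 0) {
    return path.substr(0, path.size() - suffix.size());
  }
  return path;
}
```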





[jira] [Updated] (MESOS-2841) FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata

2015-06-10 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-2841:
--
Epic Name: framework-labels

 FrameworkInfo should include a Labels field to support arbitrary, lightweight 
 metadata
 --

 Key: MESOS-2841
 URL: https://issues.apache.org/jira/browse/MESOS-2841
 Project: Mesos
  Issue Type: Epic
Reporter: James DeFelice
  Labels: mesosphere

 A framework instance may offer specific capabilities to the cluster: storage, 
 smartly-balanced request handling across deployed tasks, access to 3rd party 
 services outside of the cluster, etc. These capabilities may or may not be 
 utilized by all, or even most mesos clusters. However, it should be possible 
 for processes running in the cluster to discover capabilities or features of 
 frameworks in order to achieve a higher level of functionality and a more 
 seamless integration experience across the cluster.
 A rich discovery API attached to the FrameworkInfo could result in some form 
 of early lock-in: there are probably many ways to realize cross-framework 
 integration and external services integration that we haven't considered yet. 
 Rather than over-specify a discovery info message type at the framework level 
 I think FrameworkInfo should expose a **very generic** way to supply metadata 
 for interested consumers (other processes, tasks, etc).
 Adding a Labels field to FrameworkInfo reuses an existing message type and 
 seems to fit well with the overall intent: attaching generic metadata to a 
 framework instance. These labels should be visible when querying a mesos 
 master's state.json endpoint.





[jira] [Commented] (MESOS-2768) SIGPIPE in process::run_in_event_loop()

2015-06-10 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580855#comment-14580855
 ] 

Vinod Kone commented on MESOS-2768:
---

This is happening to us in production, when purely executing the binary.

 SIGPIPE in process::run_in_event_loop()
 ---

 Key: MESOS-2768
 URL: https://issues.apache.org/jira/browse/MESOS-2768
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Yan Xu
Priority: Critical

 Observed in production.
 {noformat:title=slave log}
 I0526 12:17:48.027257 51633 slave.cpp:4077] Received a new estimation of the 
 oversubscribable resources 
 W0526 12:17:48.027257 51636 logging.cpp:91] RAW: Received signal SIGPIPE; 
 escalating to SIGABRT
 *** Aborted at 1432642668 (unix time) try date -d @1432642668 if you are 
 using GNU date ***
 PC: @ 0x7fa58c23eb6d raise
 *** SIGABRT (@0xc9a5) received by PID 51621 (TID 0x7fa58224c940) from PID 
 51621; stack trace: ***
 @ 0x7fa58c23eca0 (unknown)
 @ 0x7fa58c23eb6d raise
 @ 0x7fa58cc19ba7 mesos::internal::logging::handler()
 @ 0x7fa58c23eca0 (unknown)
 @ 0x7fa58c23da2b __libc_write
 @ 0x7fa58cb57b6f evpipe_write.part.5
 @ 0x7fa58d245070 process::run_in_event_loop()
 @ 0x7fa58d2441ba process::EventLoop::delay()
 @ 0x7fa58d1c3c9c process::clock::scheduleTick()
 @ 0x7fa58d1c65b1 process::Clock::timer()
 @ 0x7fa58d23915a process::delay()
 @ 0x7fa58d23a740 process::ReaperProcess::wait()
 @ 0x7fa58d21261a process::ProcessManager::resume()
 @ 0x7fa58d2128dc process::schedule()
 @ 0x7fa58c23683d start_thread
 @ 0x7fa58ba28fcd clone
 {noformat}
 {noformat:title=gdb}
 (gdb) bt
 #0  0x7fa58c23eb6d in raise () from /lib64/libpthread.so.0
 #1  0x7fa58cc19ba7 in mesos::internal::logging::handler (signal=Unhandled 
 dwarf expression opcode 0xf3
 ) at logging/logging.cpp:92
 #2  signal handler called
 #3  0x7fa58c23da2b in write () from /lib64/libpthread.so.0
 #4  0x7fa58cb57b6f in evpipe_write (loop=0x7fa58e1e79c0, flag=Unhandled 
 dwarf expression opcode 0xfa
 ) at ev.c:2172
 #5  0x7fa58d245070 in process::run_in_event_loop<Nothing>(const 
 std::function<process::Future<Nothing>()> &) (f=Unhandled dwarf expression 
 opcode 0xf3
 ) at src/libev.hpp:80
 #6  0x7fa58d2441ba in process::EventLoop::delay(const Duration &, const 
 std::function<void()> &) (duration=Unhandled dwarf expression opcode 0xf3
 ) at src/libev.cpp:106
 #7  0x7fa58d1c3c9c in process::clock::scheduleTick (timers=Unhandled 
 dwarf expression opcode 0xf3
 ) at src/clock.cpp:119
 #8  0x7fa58d1c65b1 in process::Clock::timer(const Duration &, const 
 std::function<void()> &) (duration=Unhandled dwarf expression opcode 0xf3
 ) at src/clock.cpp:254
 #9  0x7fa58d23915a in process::delay<process::ReaperProcess> 
 (duration=..., pid=Unhandled dwarf expression opcode 0xf3
 ) at ./include/process/delay.hpp:25
 #10 0x7fa58d23a740 in process::ReaperProcess::wait (this=0x2056920) at 
 src/reap.cpp:93
 #11 0x7fa58d21261a in process::ProcessManager::resume (this=0x1db8d20, 
 process=0x2056958) at src/process.cpp:2172
 #12 0x7fa58d2128dc in process::schedule (arg=Unhandled dwarf expression 
 opcode 0xf3
 ) at src/process.cpp:602
 #13 0x7fa58c23683d in start_thread () from /lib64/libpthread.so.0
 #14 0x7fa58ba28fcd in clone () from /lib64/libc.so.6
 {noformat}
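
As background to the traces above: on POSIX systems, writing to a pipe with no reader raises SIGPIPE by default; if the signal is ignored, write() instead fails with EPIPE and the caller can handle the error. The sketch below demonstrates that general behavior only. It is not the fix adopted for this ticket, and writeToClosedPipe is a name invented for the example.

```cpp
#include <cerrno>
#include <csignal>
#include <unistd.h>

// Writes one byte to a pipe whose read end is already closed and returns
// the errno reported by write(), or 0 if the write unexpectedly succeeds.
// Returns -1 if the pipe could not be created.
int writeToClosedPipe() {
  // With SIGPIPE ignored, write() fails with EPIPE instead of raising
  // a signal that would otherwise terminate the process.
  std::signal(SIGPIPE, SIG_IGN);

  int fds[2];
  if (pipe(fds) != 0) {
    return -1;
  }
  close(fds[0]);  // No reader is left.

  const char byte = 'x';
  const ssize_t written = write(fds[1], &byte, 1);
  // Capture errno immediately, before close() can overwrite it.
  const int error = (written == -1) ? errno : 0;

  close(fds[1]);
  return error;
}
```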



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-2551) C++ Scheduler library should send Call messages to Master

2015-06-10 Thread Isabel Jimenez (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Jimenez updated MESOS-2551:
--
Issue Type: Story  (was: Bug)

 C++ Scheduler library should send Call messages to Master
 -

 Key: MESOS-2551
 URL: https://issues.apache.org/jira/browse/MESOS-2551
 Project: Mesos
  Issue Type: Story
Reporter: Vinod Kone
Assignee: Isabel Jimenez

 Currently, the C++ library sends different messages to Master instead of a 
 single Call message. To vet the new Call API it should send Call messages. 
 Master should be updated to handle all types of Calls.
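
The "single Call message" idea can be sketched as one envelope type whose type field selects the action, with the master dispatching on that field. Everything below is a hypothetical simplification: the CallType values are an illustrative subset, and Call and handleCall are plain C++ stand-ins rather than the real protobuf message or master code.

```cpp
#include <string>

// Hypothetical subset of call types; the real Call message defines more.
enum class CallType { SUBSCRIBE, KILL, ACKNOWLEDGE };

// Single envelope: one message type whose `type` field selects the action.
struct Call {
  CallType type;
  std::string frameworkId;
};

// Master-side dispatch on the envelope's type. Returns a description of the
// action taken so the sketch stays self-contained.
std::string handleCall(const Call& call) {
  switch (call.type) {
    case CallType::SUBSCRIBE:
      return "subscribe " + call.frameworkId;
    case CallType::KILL:
      return "kill task for " + call.frameworkId;
    case CallType::ACKNOWLEDGE:
      return "acknowledge update for " + call.frameworkId;
  }
  return "unknown call";
}
```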





[jira] [Updated] (MESOS-2841) FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata

2015-06-10 Thread Niklas Quarfot Nielsen (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Niklas Quarfot Nielsen updated MESOS-2841:
--
Remaining Estimate: 8m
 Original Estimate: 8m

 FrameworkInfo should include a Labels field to support arbitrary, lightweight 
 metadata
 --

 Key: MESOS-2841
 URL: https://issues.apache.org/jira/browse/MESOS-2841
 Project: Mesos
  Issue Type: Improvement
Reporter: James DeFelice
   Original Estimate: 8m
  Remaining Estimate: 8m

 A framework instance may offer specific capabilities to the cluster: storage, 
 smartly-balanced request handling across deployed tasks, access to 3rd party 
 services outside of the cluster, etc. These capabilities may or may not be 
 utilized by all, or even most mesos clusters. However, it should be possible 
 for processes running in the cluster to discover capabilities or features of 
 frameworks in order to achieve a higher level of functionality and a more 
 seamless integration experience across the cluster.
 A rich discovery API attached to the FrameworkInfo could result in some form 
 of early lock-in: there are probably many ways to realize cross-framework 
 integration and external services integration that we haven't considered yet. 
 Rather than over-specify a discovery info message type at the framework level 
 I think FrameworkInfo should expose a **very generic** way to supply metadata 
 for interested consumers (other processes, tasks, etc).
 Adding a Labels field to FrameworkInfo reuses an existing message type and 
 seems to fit well with the overall intent: attaching generic metadata to a 
 framework instance. These labels should be visible when querying a mesos 
 master's state.json endpoint.





[jira] [Commented] (MESOS-2841) FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata

2015-06-10 Thread James DeFelice (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580916#comment-14580916
 ] 

James DeFelice commented on MESOS-2841:
---

I think I like the idea of a shared info type that, perhaps, contains a 
Labels field (generic :) ). It feels more explicit: this is information that 
will be shared with mesos and/or mesos clients/frameworks/consumers/etc. 

I'm not certain about the visibility/scope of the Capability message - who is 
meant to consume this? It appears to be intended for mesos itself, not the 
wider cluster. I'm not so sure that stuffing Labels into Capability fits 
foreseeable use cases: I can imagine scenario(s) where an admin may want to 
label frameworks with metadata that has nothing to do with capabilities.

As you mentioned, webui_url also seems like a good candidate for inclusion in a 
shared info field.

 FrameworkInfo should include a Labels field to support arbitrary, lightweight 
 metadata
 --

 Key: MESOS-2841
 URL: https://issues.apache.org/jira/browse/MESOS-2841
 Project: Mesos
  Issue Type: Improvement
Reporter: James DeFelice
  Labels: mesosphere

 A framework instance may offer specific capabilities to the cluster: storage, 
 smartly-balanced request handling across deployed tasks, access to 3rd party 
 services outside of the cluster, etc. These capabilities may or may not be 
 utilized by all, or even most mesos clusters. However, it should be possible 
 for processes running in the cluster to discover capabilities or features of 
 frameworks in order to achieve a higher level of functionality and a more 
 seamless integration experience across the cluster.
 A rich discovery API attached to the FrameworkInfo could result in some form 
 of early lock-in: there are probably many ways to realize cross-framework 
 integration and external services integration that we haven't considered yet. 
 Rather than over-specify a discovery info message type at the framework level 
 I think FrameworkInfo should expose a **very generic** way to supply metadata 
 for interested consumers (other processes, tasks, etc).
 Adding a Labels field to FrameworkInfo reuses an existing message type and 
 seems to fit well with the overall intent: attaching generic metadata to a 
 framework instance. These labels should be visible when querying a mesos 
 master's state.json endpoint.





[jira] [Updated] (MESOS-2297) Add authentication support for HTTP API

2015-06-10 Thread Isabel Jimenez (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Jimenez updated MESOS-2297:
--
Story Points: 1

 Add authentication support for HTTP API
 ---

 Key: MESOS-2297
 URL: https://issues.apache.org/jira/browse/MESOS-2297
 Project: Mesos
  Issue Type: Task
Reporter: Vinod Kone
Assignee: Isabel Jimenez
  Labels: mesosphere

 To start with, we will only support basic http auth.





[jira] [Updated] (MESOS-2293) Implement the Call endpoint on master

2015-06-10 Thread Isabel Jimenez (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Isabel Jimenez updated MESOS-2293:
--
Issue Type: Story  (was: Task)

 Implement the Call endpoint on master
 -

 Key: MESOS-2293
 URL: https://issues.apache.org/jira/browse/MESOS-2293
 Project: Mesos
  Issue Type: Story
Reporter: Vinod Kone
Assignee: Isabel Jimenez
  Labels: mesosphere







[jira] [Commented] (MESOS-2841) FrameworkInfo should include a Labels field to support arbitrary, lightweight metadata

2015-06-10 Thread Niklas Quarfot Nielsen (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14580896#comment-14580896
 ] 

Niklas Quarfot Nielsen commented on MESOS-2841:
---

Great idea [~jdef]! Let us take a look at what this would look like.

Changing those during reregistration may need a special path and we need to 
make sure they are accessible through the state endpoints.

We introduce the notion of webui_url - maybe they should both be encapsulated 
in a SharedFrameworkInfo message (without any ties to the name - feel free to 
come up with another über message to encapsulate this in).

We also have a 'framework capability' message now (introduced during the 
oversubscription work). Would it make sense to put it in there, perhaps?

I have no strong preference at this point; I'm just exploring different 
approaches :)

 FrameworkInfo should include a Labels field to support arbitrary, lightweight 
 metadata
 --

 Key: MESOS-2841
 URL: https://issues.apache.org/jira/browse/MESOS-2841
 Project: Mesos
  Issue Type: Improvement
Reporter: James DeFelice

 A framework instance may offer specific capabilities to the cluster: storage, 
 smartly-balanced request handling across deployed tasks, access to 3rd party 
 services outside of the cluster, etc. These capabilities may or may not be 
 utilized by all, or even most mesos clusters. However, it should be possible 
 for processes running in the cluster to discover capabilities or features of 
 frameworks in order to achieve a higher level of functionality and a more 
 seamless integration experience across the cluster.
 A rich discovery API attached to the FrameworkInfo could result in some form 
 of early lock-in: there are probably many ways to realize cross-framework 
 integration and external services integration that we haven't considered yet. 
 Rather than over-specify a discovery info message type at the framework level 
 I think FrameworkInfo should expose a **very generic** way to supply metadata 
 for interested consumers (other processes, tasks, etc).
 Adding a Labels field to FrameworkInfo reuses an existing message type and 
 seems to fit well with the overall intent: attaching generic metadata to a 
 framework instance. These labels should be visible when querying a mesos 
 master's state.json endpoint.





[jira] [Created] (MESOS-2844) Add and document new labels field to framework info

2015-06-10 Thread Niklas Quarfot Nielsen (JIRA)
Niklas Quarfot Nielsen created MESOS-2844:
-

 Summary: Add and document new labels field to framework info
 Key: MESOS-2844
 URL: https://issues.apache.org/jira/browse/MESOS-2844
 Project: Mesos
  Issue Type: Improvement
Reporter: Niklas Quarfot Nielsen


Add and document new labels field to framework info:

{code}
message FrameworkInfo {
  // Used to determine the Unix user that an executor or task should
  // be launched as. If the user field is set to an empty string Mesos
  // will automagically set it to the current user.
  required string user = 1;

  // Name of the framework that shows up in the Mesos Web UI.
  required string name = 2;

  // Note that 'id' is only available after a framework has
  // registered, however, it is included here in order to facilitate
  // scheduler failover (i.e., if it is set then the
  // MesosSchedulerDriver expects the scheduler is performing
  // failover).
  optional FrameworkID id = 3;

  ...

  // This field allows a framework to advertise its set of
  // capabilities (e.g., ability to receive offers for revocable
  // resources).
  repeated Capability capabilities = 10;

  optional Labels labels = 11;
}
{code}
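
As a rough illustration of how a consumer might use such labels once they are exposed through the master's state.json endpoint, here is a minimal sketch. The JSON layout (a "labels" array of key/value pairs on each framework), the sample framework names, and the helper name frameworksWithLabel are all assumptions made for this example, not a confirmed format.

```javascript
// Hypothetical excerpt of a master's state.json. The layout of "labels"
// is an assumption made for this sketch, not a confirmed format.
const state = {
  frameworks: [
    { name: 'marathon', labels: [{ key: 'tier', value: 'prod' }] },
    { name: 'chronos', labels: [{ key: 'tier', value: 'dev' }] },
    { name: 'legacy' } // frameworks without labels must still be handled
  ]
};

// Returns the names of frameworks that carry the given key/value label.
function frameworksWithLabel(state, key, value) {
  return state.frameworks
    .filter(function (framework) {
      // Treat a missing labels field as an empty list.
      return (framework.labels || []).some(function (label) {
        return label.key === key && label.value === value;
      });
    })
    .map(function (framework) { return framework.name; });
}

console.log(frameworksWithLabel(state, 'tier', 'prod'));
```

Because the field is optional, the sketch treats a framework without labels the same as one with an empty list, which keeps older frameworks working unchanged.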





[jira] [Resolved] (MESOS-2835) Fix typos in source comments

2015-06-10 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2835?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone resolved MESOS-2835.
---
   Resolution: Fixed
Fix Version/s: 0.23.0

commit 584887b6158b4a70aeaefd2b22499a7f4fa8bb7e
Author: Aditi Dixit aditi9di...@gmail.com
Date:   Wed Jun 10 13:58:14 2015 -0700

Fixed some typos.

Review: https://reviews.apache.org/r/35291


 Fix typos in source comments
 

 Key: MESOS-2835
 URL: https://issues.apache.org/jira/browse/MESOS-2835
 Project: Mesos
  Issue Type: Bug
  Components: technical debt
Reporter: Niklas Quarfot Nielsen
Assignee: Aditi Dixit
Priority: Trivial
  Labels: newbie
 Fix For: 0.23.0


 Review request https://reviews.apache.org/r/13709 carried a bunch of good 
 typo fixes, but has got stalled due to other code fixes and lack of 
 attention. We should create a typo review request and get these in.





[jira] [Created] (MESOS-2852) Remove stout/lambda.hpp now that C++11 is required.

2015-06-10 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-2852:
--

 Summary: Remove stout/lambda.hpp now that C++11 is required.
 Key: MESOS-2852
 URL: https://issues.apache.org/jira/browse/MESOS-2852
 Project: Mesos
  Issue Type: Task
  Components: stout, technical debt
Reporter: Benjamin Mahler


Now that we no longer need to pull in tr1, let's remove stout/lambda.hpp. It 
looks like std::function has already crept its way into the libprocess source, 
but the mesos source still uses {{lambda::}}.





[jira] [Created] (MESOS-2853) Report per-container metrics from host egress filter

2015-06-10 Thread Paul Brett (JIRA)
Paul Brett created MESOS-2853:
-

 Summary: Report per-container metrics from host egress filter
 Key: MESOS-2853
 URL: https://issues.apache.org/jira/browse/MESOS-2853
 Project: Mesos
  Issue Type: Improvement
  Components: isolation, twitter
Reporter: Paul Brett
Assignee: Paul Brett


Export in statistics.json the fq_codel flow statistics for each container.





[jira] [Commented] (MESOS-2838) In Resources JSON model() resources of the same name overwrite each other.

2015-06-10 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14581191#comment-14581191
 ] 

Yan Xu commented on MESOS-2838:
---

[~haosd...@gmail.com] Yes they are coalesced.

 In Resources JSON model() resources of the same name overwrite each other.
 --

 Key: MESOS-2838
 URL: https://issues.apache.org/jira/browse/MESOS-2838
 Project: Mesos
  Issue Type: Bug
Reporter: Yan Xu
Assignee: Yan Xu
  Labels: twitter

 As shown here: 
 https://github.com/apache/mesos/blob/8559d7b7356ec91795e564767588c6f4519653a5/src/common/http.cpp#L50
 So if there are two cpus of different roles, whichever comes later will 
 overwrite the previous.
 We should instead aggregate different resources of the same name.
 However, in the presence of revocable resources, in order to maintain 
 backwards compatibility we should exclude revocable resources.
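
The difference between overwriting and aggregating can be sketched as follows. This is a simplified illustration only: the record shape (name, role, scalar, revocable) and the function names are hypothetical and do not mirror the actual code in src/common/http.cpp.

```javascript
// Hypothetical, simplified resource records; the field names are
// assumptions for illustration and not the real Resource protobuf.
const resources = [
  { name: 'cpus', role: 'role1', scalar: 2, revocable: false },
  { name: 'cpus', role: 'role2', scalar: 4, revocable: false },
  { name: 'cpus', role: '*', scalar: 8, revocable: true },
  { name: 'mem', role: 'role1', scalar: 1024, revocable: false }
];

// Overwriting behaviour described in the ticket: for each name,
// whichever entry comes later replaces the earlier one.
function modelOverwrite(resources) {
  const result = {};
  resources.forEach(function (resource) {
    result[resource.name] = resource.scalar;
  });
  return result;
}

// Aggregating behaviour: sum resources of the same name across roles,
// skipping revocable resources to keep the reported totals unchanged
// for existing consumers.
function modelAggregate(resources) {
  const result = {};
  resources.forEach(function (resource) {
    if (resource.revocable) {
      return;
    }
    result[resource.name] = (result[resource.name] || 0) + resource.scalar;
  });
  return result;
}

console.log(modelOverwrite(resources));
console.log(modelAggregate(resources));
```

With the sample data, the overwriting model reports only the last cpus entry, while the aggregating model sums the two non-revocable cpus entries.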





[jira] [Updated] (MESOS-2854) Resources::parse(...) allows different resources of the same name to have different types.

2015-06-10 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-2854:
--
Sprint: Twitter Mesos Q2 Sprint 5

 Resources::parse(...) allows different resources of the same name to have 
 different types.
 --

 Key: MESOS-2854
 URL: https://issues.apache.org/jira/browse/MESOS-2854
 Project: Mesos
  Issue Type: Bug
Reporter: Yan Xu
Assignee: Yan Xu
  Labels: twitter

 Code like this doesn't raise an Error:
 {code}
 Resources::parse("foo(role1):1;foo(role2):[0-1]")
 {code}
 Allowing this doesn't seem to add value, and it complicates resource 
 maths/validation/reporting.
 We should disallow this.
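
A check along these lines could reject such input. This is only a sketch of the idea: the parsed record shape, the type names SCALAR and RANGES as used here, and the function findTypeConflicts are hypothetical and not taken from the Mesos source.

```javascript
// Returns the names of resources that appear with more than one type.
// The record shape ({ name, role, type }) is an assumption for this sketch.
function findTypeConflicts(resources) {
  const firstType = new Map(); // resource name -> first type seen
  const conflicts = new Set(); // names seen with differing types
  resources.forEach(function (resource) {
    if (!firstType.has(resource.name)) {
      firstType.set(resource.name, resource.type);
    } else if (firstType.get(resource.name) !== resource.type) {
      conflicts.add(resource.name);
    }
  });
  return Array.from(conflicts);
}

// Roughly what parsing "foo(role1):1;foo(role2):[0-1]" might produce,
// plus an unrelated resource that should not be flagged.
const parsed = [
  { name: 'foo', role: 'role1', type: 'SCALAR' },
  { name: 'foo', role: 'role2', type: 'RANGES' },
  { name: 'bar', role: 'role1', type: 'SCALAR' }
];

console.log(findTypeConflicts(parsed));
```

A parser could return an error whenever this list is non-empty, so that every resource name maps to exactly one type.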





[jira] [Assigned] (MESOS-2838) In Resources JSON model() resources of the same name overwrite each other.

2015-06-10 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu reassigned MESOS-2838:
-

Assignee: Yan Xu

 In Resources JSON model() resources of the same name overwrite each other.
 --

 Key: MESOS-2838
 URL: https://issues.apache.org/jira/browse/MESOS-2838
 Project: Mesos
  Issue Type: Bug
Reporter: Yan Xu
Assignee: Yan Xu
  Labels: twitter

 As shown here: 
 https://github.com/apache/mesos/blob/8559d7b7356ec91795e564767588c6f4519653a5/src/common/http.cpp#L50
 So if there are two cpus of different roles, whichever comes later will 
 overwrite the previous.
 We should instead aggregate different resources of the same name.
 However, in the presence of revocable resources, in order to maintain 
 backwards compatibility we should exclude revocable resources.





[jira] [Updated] (MESOS-2838) In Resources JSON model() resources of the same name overwrite each other.

2015-06-10 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-2838:
--
Labels: twitter  (was: )

 In Resources JSON model() resources of the same name overwrite each other.
 --

 Key: MESOS-2838
 URL: https://issues.apache.org/jira/browse/MESOS-2838
 Project: Mesos
  Issue Type: Bug
Reporter: Yan Xu
Assignee: Yan Xu
  Labels: twitter

 As shown here: 
 https://github.com/apache/mesos/blob/8559d7b7356ec91795e564767588c6f4519653a5/src/common/http.cpp#L50
 So if there are two cpus of different roles, whichever comes later will 
 overwrite the previous.
 We should instead aggregate different resources of the same name.
 However, in the presence of revocable resources, in order to maintain 
 backwards compatibility we should exclude revocable resources.





[jira] [Commented] (MESOS-2855) Update operational guide to include growing from standalone to high availability

2015-06-10 Thread Michael Schenck (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14581174#comment-14581174
 ] 

Michael Schenck commented on MESOS-2855:


I have a proposed fix in a local branch, and am working on the ReviewBoard part 
of submitting it.

 Update operational guide to include growing from standalone to high 
 availability
 

 Key: MESOS-2855
 URL: https://issues.apache.org/jira/browse/MESOS-2855
 Project: Mesos
  Issue Type: Documentation
Reporter: Michael Schenck
  Labels: documentation

 The [Operational 
 Guide|http://mesos.apache.org/documentation/latest/operational-guide/] covers 
 increasing quorum size from {{--quorum=2}}, but does not cover how to move 
 from a _standalone_ master to a high availability configuration.





[jira] [Closed] (MESOS-2821) Document and consolidate qdisc handles

2015-06-10 Thread Paul Brett (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paul Brett closed MESOS-2821.
-

 Document and consolidate qdisc handles
 --

 Key: MESOS-2821
 URL: https://issues.apache.org/jira/browse/MESOS-2821
 Project: Mesos
  Issue Type: Improvement
Reporter: Paul Brett
Assignee: Paul Brett
  Labels: twitter

 The structure of traffic control qdiscs and filters is non-trivial: the 
 knowledge of which handles are the parents of which filters or qdiscs lives in 
 the create and recovery functions, and it will also be needed to collect 
 statistics on the links.  Let's pull out the constants and document them.





[jira] [Commented] (MESOS-2854) Resources::parse(...) allows different resources of the same name to have different types.

2015-06-10 Thread Yan Xu (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14581168#comment-14581168
 ] 

Yan Xu commented on MESOS-2854:
---

https://reviews.apache.org/r/35327/

 Resources::parse(...) allows different resources of the same name to have 
 different types.
 --

 Key: MESOS-2854
 URL: https://issues.apache.org/jira/browse/MESOS-2854
 Project: Mesos
  Issue Type: Bug
Reporter: Yan Xu
Assignee: Yan Xu
  Labels: twitter

 Code like this doesn't raise an Error:
 {code}
 Resources::parse("foo(role1):1;foo(role2):[0-1]")
 {code}
 Allowing this doesn't seem to add value, and it complicates resource 
 maths/validation/reporting.
 We should disallow this.





[jira] [Updated] (MESOS-2855) Update operational guide to include growing from standalone to high availability

2015-06-10 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-2855:
--
Shepherd: Vinod Kone  (was: Michael Schenck)

 Update operational guide to include growing from standalone to high 
 availability
 

 Key: MESOS-2855
 URL: https://issues.apache.org/jira/browse/MESOS-2855
 Project: Mesos
  Issue Type: Documentation
Reporter: Michael Schenck
Assignee: Michael Schenck
  Labels: documentation

 The [Operational 
 Guide|http://mesos.apache.org/documentation/latest/operational-guide/] covers 
 increasing quorum size from {{--quorum=2}}, but does not cover how to move 
 from a _standalone_ master to a high availability configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-2855) Update operational guide to include growing from standalone to high availability

2015-06-10 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-2855:
--
Assignee: Michael Schenck

 Update operational guide to include growing from standalone to high 
 availability
 

 Key: MESOS-2855
 URL: https://issues.apache.org/jira/browse/MESOS-2855
 Project: Mesos
  Issue Type: Documentation
Reporter: Michael Schenck
Assignee: Michael Schenck
  Labels: documentation

 The [Operational 
 Guide|http://mesos.apache.org/documentation/latest/operational-guide/] covers 
 increasing quorum size from {{--quorum=2}}, but does not cover how to move 
 from a _standalone_ master to a high availability configuration.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (MESOS-2854) Resources::parse(...) allows different resources of the same name to have different types.

2015-06-10 Thread Yan Xu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yan Xu updated MESOS-2854:
--
Labels: twitter  (was: )

 Resources::parse(...) allows different resources of the same name to have 
 different types.
 --

 Key: MESOS-2854
 URL: https://issues.apache.org/jira/browse/MESOS-2854
 Project: Mesos
  Issue Type: Bug
Reporter: Yan Xu
Assignee: Yan Xu
  Labels: twitter

 Code like this doesn't raise an Error:
 {code}
 Resources::parse("foo(role1):1;foo(role2):[0-1]")
 {code}
 Allowing this doesn't seem to add value, and it complicates resource 
 maths/validation/reporting.
 We should disallow this.





[jira] [Created] (MESOS-2855) Update operational guide to include growing from standalone to high availability

2015-06-10 Thread Michael Schenck (JIRA)
Michael Schenck created MESOS-2855:
--

 Summary: Update operational guide to include growing from 
standalone to high availability
 Key: MESOS-2855
 URL: https://issues.apache.org/jira/browse/MESOS-2855
 Project: Mesos
  Issue Type: Documentation
Reporter: Michael Schenck


The [Operational 
Guide|http://mesos.apache.org/documentation/latest/operational-guide/] covers 
increasing quorum size from {{--quorum=2}}, but does not cover how to move from 
a _standalone_ master to a high availability configuration.





[jira] [Commented] (MESOS-2855) Update operational guide to include growing from standalone to high availability

2015-06-10 Thread Michael Schenck (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14581166#comment-14581166
 ] 

Michael Schenck commented on MESOS-2855:


It doesn't appear that I can assign this to myself

 Update operational guide to include growing from standalone to high 
 availability
 

 Key: MESOS-2855
 URL: https://issues.apache.org/jira/browse/MESOS-2855
 Project: Mesos
  Issue Type: Documentation
Reporter: Michael Schenck
  Labels: documentation

 The [Operational 
 Guide|http://mesos.apache.org/documentation/latest/operational-guide/] covers 
 increasing quorum size from {{--quorum=2}}, but does not cover how to move 
 from a _standalone_ master to a high availability configuration.





[jira] [Updated] (MESOS-2855) Update operational guide to include growing from standalone to high availability

2015-06-10 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-2855:
--
Shepherd: Vinod Kone  (was: Vinod Kone)

 Update operational guide to include growing from standalone to high 
 availability
 

 Key: MESOS-2855
 URL: https://issues.apache.org/jira/browse/MESOS-2855
 Project: Mesos
  Issue Type: Documentation
Reporter: Michael Schenck
Assignee: Michael Schenck
  Labels: documentation

 The [Operational 
 Guide|http://mesos.apache.org/documentation/latest/operational-guide/] covers 
 increasing quorum size from {{--quorum=2}}, but does not cover how to move 
 from a _standalone_ master to a high availability configuration.





[jira] [Updated] (MESOS-2753) Master should validate tasks using oversubscribed resources

2015-06-10 Thread Vinod Kone (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2753?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinod Kone updated MESOS-2753:
--
Story Points: 3  (was: 2)

 Master should validate tasks using oversubscribed resources
 ---

 Key: MESOS-2753
 URL: https://issues.apache.org/jira/browse/MESOS-2753
 Project: Mesos
  Issue Type: Task
  Components: isolation, master
Affects Versions: 0.23.0
Reporter: Ian Downes
Assignee: Vinod Kone
  Labels: twitter
 Fix For: 0.23.0


 Current implementation out for [review|https://reviews.apache.org/r/34310] 
 only supports setting the priority of containers with revocable CPU if it's 
 specified in the initial executor info resources. This should be enforced at 
 the master.
 Also master should make sure that oversubscribed resources used by the task 
 are valid.





[jira] [Created] (MESOS-2856) fully functional webui requires all masters+slaves to be routable by browser

2015-06-10 Thread Oliver Nicholas (JIRA)
Oliver Nicholas created MESOS-2856:
--

 Summary: fully functional webui requires all masters+slaves to be 
routable by browser
 Key: MESOS-2856
 URL: https://issues.apache.org/jira/browse/MESOS-2856
 Project: Mesos
  Issue Type: Bug
Reporter: Oliver Nicholas
Assignee: Oliver Nicholas


The core issue is that the Mesos web UI doesn't play well behind a proxy.  This 
is closely related to MESOS-1822, but I propose a different mechanism to 
resolve it.

The mesos webui expects that it can both: 
 # redirect your browser to the {{--hostname}} registered for another master if 
the master you're currently hitting isn't the leader (this is MESOS-1822)
 # redirect you directly to the {{--hostname}} of any slave (e.g., to look at 
sandbox logs)

But I don't want that to be possible really - fundamentally, mesos-master's 
webui is not appropriate for direct exposure to the Internet; in many 
enterprise cases it's not appropriate for wide exposure on a VPN either.  It 
has no access controls. And those machines shouldn't really have public IPs or 
DNS entries anyway.

The way I want to resolve this for my enterprise is to have a _single_ hostname 
(as in URL component, not literally unix host), with public DNS that can handle 
the entire webui experience (aside: the vhost this hits has an authentication 
middleware to handle the access control part).  In my case, the vhost is load 
balanced randomly across the mesos master quorum machines, and some slightly 
complex nginx rules allow it to effectively proxy any request to the actual 
leader master regardless of where the request goes.  This works nicely.

However, the mesos webui also attempts to route directly to slave machines 
with JSONP requests (e.g., as I mentioned above, to load the sandbox stdout for 
a task).  My slave machines don't even have public IPs.

Given this existing code for generating the parameters for URLs to slaves:
{code:javascript}
var pid = $scope.slaves[$routeParams.slave_id].pid;
var hostname = $scope.slaves[$routeParams.slave_id].hostname;
var id = pid.substring(0, pid.indexOf('@'));
var host = hostname + ':' + pid.substring(pid.lastIndexOf(':') + 1);
{code}

I'd like to implement a {{--enable-webui-proxy-urls}} option on the master that 
would change behavior in the webui from generating a slave URL as:
{code:javascript}
var url = '//' + host + '/files/browse.json?jsonp=JSON_CALLBACK';
{code}

to instead generate the URL as:
{code:javascript}
var url = '/proxy?host=' + encodeURIComponent(host) + '&orig=' + 
encodeURIComponent('/files/browse.json?jsonp=JSON_CALLBACK');
{code}

and then nginx can take care of rewriting/reissuing the request and proxying 
the results back to the browser.

All of this could be obviated and probably made more elegant by implementing 
MESOS-2130/MESOS-2131, but this seems like a simple change to make.  Any 
thoughts?
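
To make the two URL forms concrete, here is a small self-contained sketch of the proposed behaviour. The /proxy path and the host and orig parameters come from the proposal above; the function names slaveEndpoint and slaveUrl, the useProxy argument, and the sample pid and hostname are hypothetical and used only for illustration.

```javascript
// Derives the id and host:port of a slave from its pid and hostname,
// following the same string handling as the existing webui snippet.
function slaveEndpoint(slave) {
  const pid = slave.pid;
  return {
    id: pid.substring(0, pid.indexOf('@')),
    host: slave.hostname + ':' + pid.substring(pid.lastIndexOf(':') + 1)
  };
}

// Builds either a direct slave URL or a proxied one, depending on
// whether the (proposed) proxy-URL option is enabled.
function slaveUrl(host, path, useProxy) {
  if (useProxy) {
    // Both values are percent-encoded so that the proxy can recover
    // the original host and path unambiguously.
    return '/proxy?host=' + encodeURIComponent(host) +
      '&orig=' + encodeURIComponent(path);
  }
  return '//' + host + path;
}

// Sample values are made up for this sketch.
const endpoint = slaveEndpoint({
  pid: 'slave(1)@10.0.0.5:5051',
  hostname: 'slave1.internal'
});
const path = '/files/browse.json?jsonp=JSON_CALLBACK';

console.log(slaveUrl(endpoint.host, path, false));
console.log(slaveUrl(endpoint.host, path, true));
```

On the proxy side, the host and orig values can be recovered with a standard URL decode before the request is reissued to the slave.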





[jira] [Commented] (MESOS-1856) Support specifying libnl3 install location.

2015-06-10 Thread Roger Ignazio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14581464#comment-14581464
 ] 

Roger Ignazio commented on MESOS-1856:
--

I hit this today when trying to build Mesos 0.22.0 with the network isolator on 
RHEL 7 and Ubuntu 14.04. The latest version of libnl3 in the package repos is 
3.2.21, which means that we need to build libnl3 from source. By default, it 
installs its header files to {{/usr/local/include/libnl3}}.

The surprising thing is that, if libnl3 isn't present anywhere on the system, 
the configure script fails like we'd expect it to. But if we've installed it to 
its default location (in {{/usr/local}}), the configure step completes 
successfully, even if nothing _actually_ works:
{code}
checking netlink/netlink.h usability... no
checking netlink/netlink.h presence... no
checking for netlink/netlink.h... no
checking libnl3/netlink/netlink.h usability... no
checking libnl3/netlink/netlink.h presence... no
checking for libnl3/netlink/netlink.h... no
checking for rtnl_link_veth_add in -lnl-route-3... yes
checking netlink/route/link/veth.h usability... no
checking netlink/route/link/veth.h presence... no
checking for netlink/route/link/veth.h... no
checking libnl3/netlink/route/link/veth.h usability... no
checking libnl3/netlink/route/link/veth.h presence... no
checking for libnl3/netlink/route/link/veth.h... no
{code}

Eventually, Make just bails out:
{code}
libtool: compile:  g++ -DPACKAGE_NAME=\"mesos\" -DPACKAGE_TARNAME=\"mesos\" 
-DPACKAGE_VERSION=\"0.22.0\" -DPACKAGE_STRING=\"mesos 0.22.0\" 
-DPACKAGE_BUGREPORT=\"\" -DPACKAGE_URL=\"\" -DPACKAGE=\"mesos\" 
-DVERSION=\"0.22.0\" -DSTDC_HEADERS=1 -DHAVE_SYS_TYPES_H=1 -DHAVE_SYS_STAT_H=1 
-DHAVE_STDLIB_H=1 -DHAVE_STRING_H=1 -DHAVE_MEMORY_H=1 -DHAVE_STRINGS_H=1 
-DHAVE_INTTYPES_H=1 -DHAVE_STDINT_H=1 -DHAVE_UNISTD_H=1 -DHAVE_DLFCN_H=1 
-DLT_OBJDIR=\".libs/\" -DHAVE_PTHREAD=1 -DHAVE_LIBZ=1 -DHAVE_LIBCURL=1 
-DHAVE_APR_POOLS_H=1 -DHAVE_LIBAPR_1=1 -DHAVE_SVN_VERSION_H=1 
-DHAVE_LIBSVN_SUBR_1=1 -DHAVE_SVN_DELTA_H=1 -DHAVE_LIBSVN_DELTA_1=1 
-DHAVE_LIBSASL2=1 -DHAVE_LIBNL_3=1 -DHAVE_LIBNL_ROUTE_3=1 
-DHAVE_LIBNL_IDIAG_3=1 -DWITH_NETWORK_ISOLATOR=1 -DMESOS_HAS_JAVA=1 
-DHAVE_PYTHON=\"2.7\" -DMESOS_HAS_PYTHON=1 -I. -Wall -Werror 
-DLIBDIR=\"/usr/local/lib\" -DPKGLIBEXECDIR=\"/usr/local/libexec/mesos\" 
-DPKGDATADIR=\"/usr/local/share/mesos\" -I../include 
-I../3rdparty/libprocess/include 
-I../3rdparty/libprocess/3rdparty/stout/include -I../include -I../include/mesos 
-I../3rdparty/libprocess/3rdparty/boost-1.53.0 
-I../3rdparty/libprocess/3rdparty/picojson-4f93734 
-I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src 
-I../3rdparty/libprocess/3rdparty/glog-0.3.3/src 
-I../3rdparty/libprocess/3rdparty/glog-0.3.3/src -I../3rdparty/leveldb/include 
-I../3rdparty/zookeeper-3.4.5/src/c/include 
-I../3rdparty/zookeeper-3.4.5/src/c/generated -I/usr/include/libnl3 
-I../3rdparty/libprocess/3rdparty/protobuf-2.5.0/src 
-I/usr/include/subversion-1 -I/usr/include/apr-1 -I/usr/include/apr-1.0 
-pthread -g1 -O0 -Wno-unused-local-typedefs -std=c++11 -MT 
linux/routing/libmesos_no_3rdparty_la-route.lo -MD -MP -MF 
linux/routing/.deps/libmesos_no_3rdparty_la-route.Tpo -c 
linux/routing/route.cpp  -fPIC -DPIC -o 
linux/routing/.libs/libmesos_no_3rdparty_la-route.o
linux/routing/route.cpp:23:26: fatal error: netlink/addr.h: No such file or 
directory
{code}

A sane-enough workaround would be to simply override the variable assignment at 
build time, which I think is preferable to having a user build libnl3 with 
{{--prefix=/usr}} (IMO, anyway).
{code}
./configure --with-network-isolator
make LIBNL_CFLAGS=-I/usr/local/include/libnl3
{code}

So if this ticket does get bumped from the next release (currently targeted at 
0.23.0), we could at least update the docs with a workaround. I think we might 
want to re-categorize this as a bug though.

 Support specifying libnl3 install location.
 ---

 Key: MESOS-1856
 URL: https://issues.apache.org/jira/browse/MESOS-1856
 Project: Mesos
  Issue Type: Task
Affects Versions: 0.22.0, 0.22.1
Reporter: Jie Yu

 LIBNL_CFLAGS uses a hard-coded path in the configure script, instead of 
 detecting the location.





[jira] [Updated] (MESOS-1856) Support specifying libnl3 install location.

2015-06-10 Thread Roger Ignazio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Ignazio updated MESOS-1856:
-
Description: LIBNL_CFLAGS uses a hard-coded path in the configure script, 
instead of detecting the location.

 Support specifying libnl3 install location.
 ---

 Key: MESOS-1856
 URL: https://issues.apache.org/jira/browse/MESOS-1856
 Project: Mesos
  Issue Type: Task
Affects Versions: 0.22.0, 0.22.1
Reporter: Jie Yu

 LIBNL_CFLAGS uses a hard-coded path in the configure script, instead of 
 detecting the location.





[jira] [Created] (MESOS-2857) FetcherCacheTest.LocalCachedExtract is flaky.

2015-06-10 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-2857:
--

 Summary: FetcherCacheTest.LocalCachedExtract is flaky.
 Key: MESOS-2857
 URL: https://issues.apache.org/jira/browse/MESOS-2857
 Project: Mesos
  Issue Type: Bug
  Components: fetcher, test
Reporter: Benjamin Mahler
Assignee: Bernd Mathiske


From jenkins:

{noformat}
[ RUN  ] FetcherCacheTest.LocalCachedExtract
Using temporary directory '/tmp/FetcherCacheTest_LocalCachedExtract_Cwdcdj'
I0610 20:04:48.591573 24561 leveldb.cpp:176] Opened db in 3.512525ms
I0610 20:04:48.592456 24561 leveldb.cpp:183] Compacted db in 828630ns
I0610 20:04:48.592512 24561 leveldb.cpp:198] Created db iterator in 32992ns
I0610 20:04:48.592531 24561 leveldb.cpp:204] Seeked to beginning of db in 8967ns
I0610 20:04:48.592545 24561 leveldb.cpp:273] Iterated through 0 keys in the db 
in 7762ns
I0610 20:04:48.592604 24561 replica.cpp:744] Replica recovered with log 
positions 0 - 0 with 1 holes and 0 unlearned
I0610 20:04:48.593438 24587 recover.cpp:449] Starting replica recovery
I0610 20:04:48.593698 24587 recover.cpp:475] Replica is in EMPTY status
I0610 20:04:48.595641 24580 replica.cpp:641] Replica in EMPTY status received a 
broadcasted recover request
I0610 20:04:48.596086 24590 recover.cpp:195] Received a recover response from a 
replica in EMPTY status
I0610 20:04:48.596607 24590 recover.cpp:566] Updating replica status to STARTING
I0610 20:04:48.597507 24590 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 717888ns
I0610 20:04:48.597535 24590 replica.cpp:323] Persisted replica status to 
STARTING
I0610 20:04:48.597697 24590 recover.cpp:475] Replica is in STARTING status
I0610 20:04:48.599165 24584 replica.cpp:641] Replica in STARTING status 
received a broadcasted recover request
I0610 20:04:48.599434 24584 recover.cpp:195] Received a recover response from a 
replica in STARTING status
I0610 20:04:48.599915 24590 recover.cpp:566] Updating replica status to VOTING
I0610 20:04:48.600545 24590 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 432335ns
I0610 20:04:48.600574 24590 replica.cpp:323] Persisted replica status to VOTING
I0610 20:04:48.600659 24590 recover.cpp:580] Successfully joined the Paxos group
I0610 20:04:48.600797 24590 recover.cpp:464] Recover process terminated
I0610 20:04:48.602905 24594 master.cpp:363] Master 
20150610-200448-3875541420-32907-24561 (dbade881e927) started on 
172.17.0.231:32907
I0610 20:04:48.602957 24594 master.cpp:365] Flags at startup: --acls= 
--allocation_interval=1secs --allocator=HierarchicalDRF 
--authenticate=true --authenticate_slaves=true --authenticators=crammd5 
--credentials=/tmp/FetcherCacheTest_LocalCachedExtract_Cwdcdj/credentials 
--framework_sorter=drf --help=false --initialize_driver_logging=true 
--log_auto_initialize=true --logbufsecs=0 --logging_level=INFO 
--quiet=false --recovery_slave_removal_limit=100% 
--registry=replicated_log --registry_fetch_timeout=1mins 
--registry_store_timeout=25secs --registry_strict=true 
--root_submissions=true --slave_reregister_timeout=10mins 
--user_sorter=drf --version=false 
--webui_dir=/mesos/mesos-0.23.0/_inst/share/mesos/webui 
--work_dir=/tmp/FetcherCacheTest_LocalCachedExtract_Cwdcdj/master 
--zk_session_timeout=10secs
I0610 20:04:48.603374 24594 master.cpp:410] Master only allowing authenticated 
frameworks to register
I0610 20:04:48.603392 24594 master.cpp:415] Master only allowing authenticated 
slaves to register
I0610 20:04:48.603404 24594 credentials.hpp:37] Loading credentials for 
authentication from 
'/tmp/FetcherCacheTest_LocalCachedExtract_Cwdcdj/credentials'
I0610 20:04:48.603751 24594 master.cpp:454] Using default 'crammd5' 
authenticator
I0610 20:04:48.604928 24594 master.cpp:491] Authorization enabled
I0610 20:04:48.606034 24593 hierarchical.hpp:309] Initialized hierarchical 
allocator process
I0610 20:04:48.606106 24593 whitelist_watcher.cpp:79] No whitelist given
I0610 20:04:48.607430 24594 master.cpp:1476] The newly elected leader is 
master@172.17.0.231:32907 with id 20150610-200448-3875541420-32907-24561
I0610 20:04:48.607466 24594 master.cpp:1489] Elected as the leading master!
I0610 20:04:48.607481 24594 master.cpp:1259] Recovering from registrar
I0610 20:04:48.607712 24594 registrar.cpp:313] Recovering registrar
I0610 20:04:48.608543 24588 log.cpp:661] Attempting to start the writer
I0610 20:04:48.610231 24588 replica.cpp:477] Replica received implicit promise 
request with proposal 1
I0610 20:04:48.611335 24588 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 1.086439ms
I0610 20:04:48.611382 24588 replica.cpp:345] Persisted promised to 1
I0610 20:04:48.612303 24588 coordinator.cpp:230] Coordinator attemping to fill 
missing position
I0610 20:04:48.613883 24593 replica.cpp:378] Replica received explicit promise 
request for position 0 with proposal 2
I0610 20:04:48.619205 24593 leveldb.cpp

[jira] [Updated] (MESOS-1856) Support specifying libnl3 install location.

2015-06-10 Thread Roger Ignazio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Roger Ignazio updated MESOS-1856:
-
Affects Version/s: 0.22.0, 0.22.1

 Support specifying libnl3 install location.
 ---

 Key: MESOS-1856
 URL: https://issues.apache.org/jira/browse/MESOS-1856
 Project: Mesos
  Issue Type: Task
Affects Versions: 0.22.0, 0.22.1
Reporter: Jie Yu







[jira] [Created] (MESOS-2858) FetcherCacheHttpTest.HttpMixed is flaky.

2015-06-10 Thread Benjamin Mahler (JIRA)
Benjamin Mahler created MESOS-2858:
--

 Summary: FetcherCacheHttpTest.HttpMixed is flaky.
 Key: MESOS-2858
 URL: https://issues.apache.org/jira/browse/MESOS-2858
 Project: Mesos
  Issue Type: Bug
  Components: fetcher, test
Reporter: Benjamin Mahler
Assignee: Bernd Mathiske


From jenkins:

{noformat}
[ RUN  ] FetcherCacheHttpTest.HttpMixed
Using temporary directory '/tmp/FetcherCacheHttpTest_HttpMixed_qfpOOC'
I0611 00:40:28.208909 26042 leveldb.cpp:176] Opened db in 3.831173ms
I0611 00:40:28.209951 26042 leveldb.cpp:183] Compacted db in 997319ns
I0611 00:40:28.210011 26042 leveldb.cpp:198] Created db iterator in 23917ns
I0611 00:40:28.210032 26042 leveldb.cpp:204] Seeked to beginning of db in 2112ns
I0611 00:40:28.210043 26042 leveldb.cpp:273] Iterated through 0 keys in the db 
in 392ns
I0611 00:40:28.210095 26042 replica.cpp:744] Replica recovered with log 
positions 0 - 0 with 1 holes and 0 unlearned
I0611 00:40:28.210741 26067 recover.cpp:449] Starting replica recovery
I0611 00:40:28.211144 26067 recover.cpp:475] Replica is in EMPTY status
I0611 00:40:28.212210 26074 replica.cpp:641] Replica in EMPTY status received a 
broadcasted recover request
I0611 00:40:28.212728 26071 recover.cpp:195] Received a recover response from a 
replica in EMPTY status
I0611 00:40:28.213260 26069 recover.cpp:566] Updating replica status to STARTING
I0611 00:40:28.214066 26073 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 590673ns
I0611 00:40:28.214095 26073 replica.cpp:323] Persisted replica status to 
STARTING
I0611 00:40:28.214350 26073 recover.cpp:475] Replica is in STARTING status
I0611 00:40:28.214774 26061 master.cpp:363] Master 
20150611-004028-1946161580-33349-26042 (658ddc752264) started on 
172.17.0.116:33349
I0611 00:40:28.214800 26061 master.cpp:365] Flags at startup: --acls= 
--allocation_interval=1secs --allocator=HierarchicalDRF 
--authenticate=true --authenticate_slaves=true --authenticators=crammd5 
--credentials=/tmp/FetcherCacheHttpTest_HttpMixed_qfpOOC/credentials 
--framework_sorter=drf --help=false --initialize_driver_logging=true 
--log_auto_initialize=true --logbufsecs=0 --logging_level=INFO 
--quiet=false --recovery_slave_removal_limit=100% 
--registry=replicated_log --registry_fetch_timeout=1mins 
--registry_store_timeout=25secs --registry_strict=true 
--root_submissions=true --slave_reregister_timeout=10mins 
--user_sorter=drf --version=false 
--webui_dir=/mesos/mesos-0.23.0/_inst/share/mesos/webui 
--work_dir=/tmp/FetcherCacheHttpTest_HttpMixed_qfpOOC/master 
--zk_session_timeout=10secs
I0611 00:40:28.215342 26061 master.cpp:410] Master only allowing authenticated 
frameworks to register
I0611 00:40:28.215361 26061 master.cpp:415] Master only allowing authenticated 
slaves to register
I0611 00:40:28.215397 26061 credentials.hpp:37] Loading credentials for 
authentication from '/tmp/FetcherCacheHttpTest_HttpMixed_qfpOOC/credentials'
I0611 00:40:28.215589 26064 replica.cpp:641] Replica in STARTING status 
received a broadcasted recover request
I0611 00:40:28.215770 26061 master.cpp:454] Using default 'crammd5' 
authenticator
I0611 00:40:28.215934 26061 master.cpp:491] Authorization enabled
I0611 00:40:28.215932 26062 recover.cpp:195] Received a recover response from a 
replica in STARTING status
I0611 00:40:28.216256 26070 whitelist_watcher.cpp:79] No whitelist given
I0611 00:40:28.216310 26066 hierarchical.hpp:309] Initialized hierarchical 
allocator process
I0611 00:40:28.216352 26067 recover.cpp:566] Updating replica status to VOTING
I0611 00:40:28.216909 26070 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 374189ns
I0611 00:40:28.216931 26070 replica.cpp:323] Persisted replica status to VOTING
I0611 00:40:28.217052 26075 recover.cpp:580] Successfully joined the Paxos group
I0611 00:40:28.217355 26063 master.cpp:1476] The newly elected leader is 
master@172.17.0.116:33349 with id 20150611-004028-1946161580-33349-26042
I0611 00:40:28.217512 26063 master.cpp:1489] Elected as the leading master!
I0611 00:40:28.217540 26063 master.cpp:1259] Recovering from registrar
I0611 00:40:28.217753 26070 registrar.cpp:313] Recovering registrar
I0611 00:40:28.217396 26075 recover.cpp:464] Recover process terminated
I0611 00:40:28.218341 26065 log.cpp:661] Attempting to start the writer
I0611 00:40:28.219391 26067 replica.cpp:477] Replica received implicit promise 
request with proposal 1
I0611 00:40:28.219696 26067 leveldb.cpp:306] Persisting metadata (8 bytes) to 
leveldb took 276905ns
I0611 00:40:28.219720 26067 replica.cpp:345] Persisted promised to 1
I0611 00:40:28.220255 26064 coordinator.cpp:230] Coordinator attemping to fill 
missing position
I0611 00:40:28.221247 26073 replica.cpp:378] Replica received explicit promise 
request for position 0 with proposal 2
I0611 00:40:28.221545 26073 leveldb.cpp:343] Persisting action (8 bytes)