[jira] [Commented] (MESOS-3370) Deprecate the external containerizer

2015-10-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948121#comment-14948121
 ] 

Adam B commented on MESOS-3370:
---

[~tarnfeld] Are you ready to move off your external containerizer 
implementation?

> Deprecate the external containerizer
> 
>
> Key: MESOS-3370
> URL: https://issues.apache.org/jira/browse/MESOS-3370
> Project: Mesos
>  Issue Type: Task
>Reporter: Niklas Quarfot Nielsen
>
> To our knowledge, no one is using the external containerizer, and we could 
> clean up code paths in the slave and containerizer interface (the dual 
> launch() signatures).
> In a deprecation cycle, we can move this code into a module (dependent on 
> containerizer modules landing) and from there, move it into its own repo.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (MESOS-1225) Allow definition/use of shared resources

2015-10-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948094#comment-14948094
 ] 

Adam B commented on MESOS-1225:
---

Related to the concept of "cluster-wide resources"

> Allow definition/use of shared resources
> 
>
> Key: MESOS-1225
> URL: https://issues.apache.org/jira/browse/MESOS-1225
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation, containerization, framework, isolation, slave
>Reporter: Tobias Weingartner
>Priority: Minor
>
> It would be nice to be able to define a set of shared resources for a set of 
> slaves (such as IP addresses, power, rack bandwidth, etc) that would be 
> managed by the master/slaves, and exported to the frameworks for their use.





[jira] [Comment Edited] (MESOS-1225) Allow definition/use of shared resources

2015-10-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948094#comment-14948094
 ] 

Adam B edited comment on MESOS-1225 at 10/8/15 5:26 AM:


Related to the concept of "cluster-wide resources" in MESOS-2728


was (Author: adam-mesos):
Related to the concept of "cluster-wide resources"

> Allow definition/use of shared resources
> 
>
> Key: MESOS-1225
> URL: https://issues.apache.org/jira/browse/MESOS-1225
> Project: Mesos
>  Issue Type: Improvement
>  Components: allocation, containerization, framework, isolation, slave
>Reporter: Tobias Weingartner
>Priority: Minor
>
> It would be nice to be able to define a set of shared resources for a set of 
> slaves (such as IP addresses, power, rack bandwidth, etc) that would be 
> managed by the master/slaves, and exported to the frameworks for their use.





[jira] [Updated] (MESOS-3605) hdfs.du() fails on os x due to lack of native-hadoop library.

2015-10-07 Thread alexius ludeman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

alexius ludeman updated MESOS-3605:
---
Description: 
hdfs.du() fails on os x due to lack of native-hadoop library.

This requires a fix from https://issues.apache.org/jira/browse/MESOS-3602 
before it's reproducible.

per 
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/NativeLibraries.html
 os x does not have native library support, and I could not readily find a way 
to disable the message.  This causes hdfs.du() to fail when parsing the output 
with the extra warning message about the unavailable native-hadoop library.

W1007 21:47:38.362117 250429440 fetcher.cpp:451] Reverting to fetching directly 
into the sandbox for 
'hdfs:///a/path/artifacts/executor/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz',
 due to failure to fetch through the cache, with error: Could not determine 
size of cache file for 
'lexinator@hdfs:///a/path/artifacts/executor/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz'
 with error: Hadoop client could not determine size: HDFS du returned an 
unexpected number of results: '2015-10-07 21:47:37,474 WARN  [main] 
util.NativeCodeLoader (NativeCodeLoader.java:(62)) - Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
10.4 M  
/a/path/artifacts/executor/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz

  was:
hdfs.du() fails on os x due to lack of native-hadoop library.

This requires a fix from https://issues.apache.org/jira/browse/MESOS-3602 
before it's reproducible.

per 
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/NativeLibraries.html
 os x does not have native library support, and I could not readily find a way 
to disable the message.  This causes hdfs.du() to fail when parsing the output 
with the warning message.

W1007 21:47:38.362117 250429440 fetcher.cpp:451] Reverting to fetching directly 
into the sandbox for 
'hdfs:///a/path/artifacts/executor/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz',
 due to failure to fetch through the cache, with error: Could not determine 
size of cache file for 
'lexinator@hdfs:///a/path/artifacts/executor/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz'
 with error: Hadoop client could not determine size: HDFS du returned an 
unexpected number of results: '2015-10-07 21:47:37,474 WARN  [main] 
util.NativeCodeLoader (NativeCodeLoader.java:(62)) - Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
10.4 M  
/a/path/artifacts/executor/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz


> hdfs.du() fails on os x due to lack of native-hadoop library.
> -
>
> Key: MESOS-3605
> URL: https://issues.apache.org/jira/browse/MESOS-3605
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher, hadoop
>Affects Versions: 0.23.0
> Environment: os x
>Reporter: alexius ludeman
>
> hdfs.du() fails on os x due to lack of native-hadoop library.
> This requires a fix from https://issues.apache.org/jira/browse/MESOS-3602 
> before it's reproducible.
> per 
> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/NativeLibraries.html
>  os x does not have native library support, and I could not readily find a 
> way to disable the message.  This causes hdfs.du() to fail when parsing the 
> output with the extra warning message about the unavailable native-hadoop 
> library.
> W1007 21:47:38.362117 250429440 fetcher.cpp:451] Reverting to fetching 
> directly into the sandbox for 
> 'hdfs:///a/path/artifacts/executor/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz',
>  due to failure to fetch through the cache, with error: Could not determine 
> size of cache file for 
> 'lexinator@hdfs:///a/path/artifacts/executor/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz'
>  with error: Hadoop client could not determine size: HDFS du returned an 
> unexpected number of results: '2015-10-07 21:47:37,474 WARN  [main] 
> util.NativeCodeLoader (NativeCodeLoader.java:(62)) - Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> 10.4 M  
> /a/path/artifacts/executor/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz





[jira] [Created] (MESOS-3605) hdfs.du() fails on os x due to lack of native-hadoop library.

2015-10-07 Thread alexius ludeman (JIRA)
alexius ludeman created MESOS-3605:
--

 Summary: hdfs.du() fails on os x due to lack of native-hadoop 
library.
 Key: MESOS-3605
 URL: https://issues.apache.org/jira/browse/MESOS-3605
 Project: Mesos
  Issue Type: Bug
  Components: fetcher, hadoop
Affects Versions: 0.23.0
 Environment: os x
Reporter: alexius ludeman


hdfs.du() fails on os x due to lack of native-hadoop library.

This requires a fix from https://issues.apache.org/jira/browse/MESOS-3602 
before it's reproducible.

per 
https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/NativeLibraries.html
 os x does not have native library support, and I could not readily find a way 
to disable the message.  This causes hdfs.du() to fail when parsing the output 
with the warning message.

W1007 21:47:38.362117 250429440 fetcher.cpp:451] Reverting to fetching directly 
into the sandbox for 
'hdfs:///a/path/artifacts/executor/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz',
 due to failure to fetch through the cache, with error: Could not determine 
size of cache file for 
'lexinator@hdfs:///a/path/artifacts/executor/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz'
 with error: Hadoop client could not determine size: HDFS du returned an 
unexpected number of results: '2015-10-07 21:47:37,474 WARN  [main] 
util.NativeCodeLoader (NativeCodeLoader.java:(62)) - Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
10.4 M  
/a/path/artifacts/executor/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz
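
One way to make the size lookup tolerant of the extra warning is to ignore any output line that does not look like a du result. The following is a minimal Python sketch of that idea, not the Mesos implementation (which is C++ code in hdfs.hpp). The function name parse_du_output, the regular expression, and the assumed "<size> [unit]  <path>" layout of `hadoop fs -du -h` output are illustrative assumptions based only on the sample output above; other Hadoop versions may print additional columns.

```python
import re

# Assumed shape of a `hadoop fs -du -h` result line, e.g. "10.4 M  /some/path".
# This layout is inferred from the sample output in the report only.
_DU_LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*([KMGTP]?)\s+(\S+)\s*$")

_UNITS = {
    "": 1,
    "K": 1024,
    "M": 1024 ** 2,
    "G": 1024 ** 3,
    "T": 1024 ** 4,
    "P": 1024 ** 5,
}


def parse_du_output(output, path):
    """Return the size in bytes reported for `path`, or None if not found."""
    for line in output.splitlines():
        match = _DU_LINE.match(line)
        if match is None:
            # Not a du result line, e.g. the NativeCodeLoader warning.
            continue
        value, unit, reported_path = match.groups()
        if reported_path == path:
            return int(float(value) * _UNITS[unit])
    return None
```

With output like the one shown above, the warning line is skipped and the "10.4 M" line is converted to a byte count instead of being counted as an unexpected extra result.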





[jira] [Commented] (MESOS-3417) Log source address replicated log recieved broadcasts

2015-10-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948052#comment-14948052
 ] 

Adam B commented on MESOS-3417:
---

commit 575f3b352f176c9801f51f0e6f61cf5c3a02167c
Author: Neil Conway 
Date:   Wed Oct 7 21:20:38 2015 -0700

Added source address to logging when we receive replicated log events.

Fixes MESOS-3417.

Review: https://reviews.apache.org/r/39104

> Log source address replicated log recieved broadcasts
> -
>
> Key: MESOS-3417
> URL: https://issues.apache.org/jira/browse/MESOS-3417
> Project: Mesos
>  Issue Type: Improvement
>  Components: replicated log
>Affects Versions: 0.23.0, 0.24.0
> Environment: Mesos 0.23
>Reporter: Cody Maloney
>Assignee: Neil Conway
>Priority: Minor
>  Labels: mesosphere, newbie
>
> Currently Mesos doesn't log what machine a replicated log status broadcast 
> was received from:
> {code}
> Sep 11 21:41:14 master-01 mesos-master[15625]: I0911 21:41:14.320164 15637 
> replica.cpp:641] Replica in EMPTY status received a broadcasted recover 
> request
> Sep 11 21:41:14 master-01 mesos-dns[15583]: I0911 21:41:14.321097   15583 
> detect.go:118] ignoring children-changed event, leader has not changed: /mesos
> Sep 11 21:41:14 master-01 mesos-master[15625]: I0911 21:41:14.353914 15639 
> replica.cpp:641] Replica in EMPTY status received a broadcasted recover 
> request
> Sep 11 21:41:14 master-01 mesos-master[15625]: I0911 21:41:14.479132 15639 
> replica.cpp:641] Replica in EMPTY status received a broadcasted recover 
> request
> {code}
> It would be really useful for debugging replicated log startup issues to have 
> info about where the message came from (libprocess address, IP, or hostname).





[jira] [Commented] (MESOS-2315) Deprecate / Remove CommandInfo::ContainerInfo

2015-10-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14948037#comment-14948037
 ] 

Adam B commented on MESOS-2315:
---

Let's start with an email to the dev and user mailing lists to verify that 
nobody is using the old ContainerInfo. They shouldn't be, and if they are, we 
should show them how to use the new one.

> Deprecate / Remove CommandInfo::ContainerInfo
> -
>
> Key: MESOS-2315
> URL: https://issues.apache.org/jira/browse/MESOS-2315
> Project: Mesos
>  Issue Type: Task
>Reporter: Ian Downes
>Assignee: Vaibhav Khanduja
>Priority: Minor
>  Labels: mesosphere, newbie
>
> IIUC this has been deprecated and all current code (except 
> examples/docker_no_executor_framework.cpp) uses the top-level ContainerInfo?





[jira] [Assigned] (MESOS-2315) Deprecate / Remove CommandInfo::ContainerInfo

2015-10-07 Thread Vaibhav Khanduja (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vaibhav Khanduja reassigned MESOS-2315:
---

Assignee: Vaibhav Khanduja

> Deprecate / Remove CommandInfo::ContainerInfo
> -
>
> Key: MESOS-2315
> URL: https://issues.apache.org/jira/browse/MESOS-2315
> Project: Mesos
>  Issue Type: Task
>Reporter: Ian Downes
>Assignee: Vaibhav Khanduja
>Priority: Minor
>  Labels: mesosphere, newbie
>
> IIUC this has been deprecated and all current code (except 
> examples/docker_no_executor_framework.cpp) uses the top-level ContainerInfo?





[jira] [Assigned] (MESOS-1757) Speed up the tests.

2015-10-07 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-1757?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent reassigned MESOS-1757:
---

Assignee: haosdent

> Speed up the tests.
> ---
>
> Key: MESOS-1757
> URL: https://issues.apache.org/jira/browse/MESOS-1757
> Project: Mesos
>  Issue Type: Epic
>  Components: technical debt, test
>Reporter: Benjamin Mahler
>Assignee: haosdent
>  Labels: twitter
>
> The full test suite is exceeding the 9 minute mark (581 seconds on my 
> machine). This epic tracks techniques to improve this:
> # Now that the master and the slave have to perform sync'ed disk writes, 
> consider using tmpfs (e.g. under /dev/shm) to speed up the disk writes. For 
> the master, we could also consider defaulting to in-memory state rather than 
> the replicated log for most tests.
> # -The reaper takes a full second to reap an exited process (MESOS-1199), 
> this adds a second to each slave recovery test, and possibly more for things 
> that rely on Subprocess.-
> # The command executor sleeps for a second when shutting down (MESOS-442); 
> this adds a second to every test that uses the command executor.
> A big improvement will come from running the tests in parallel. A few options:
> # Use automake's parallel test harness to compile tests separately and run 
> tests in parallel (see 
> [here|http://www.gnu.org/software/automake/manual/html_node/Parallel-Test-Harness.html]).
> # Continue to use one test binary, but leverage google test's ability to 
> shard tests across processes/machines (see 
> [here|https://code.google.com/p/googletest/wiki/AdvancedGuide#Distributing_Test_Functions_to_Multiple_Machines]).
>  This entails writing our own test wrapper script to decide how many 
> workers to use, etc. 
> [gtest-parallel|https://github.com/google/gtest-parallel/blob/master/gtest-parallel]
>  is an example of a parallel runner, but does not leverage the sharding 
> ability.
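
As a rough illustration of the second option, a wrapper script could start one copy of the test binary per worker and rely on Google Test's GTEST_TOTAL_SHARDS and GTEST_SHARD_INDEX environment variables to split the tests between them. The Python sketch below assumes a single test binary at a caller-supplied path and defaults the worker count to the CPU count; the function names are hypothetical and this is not an existing Mesos script.

```python
import os
import subprocess


def shard_environments(total_shards, base_env=None):
    """Build one environment per shard with Google Test's sharding variables."""
    base = dict(os.environ if base_env is None else base_env)
    environments = []
    for index in range(total_shards):
        env = dict(base)
        env["GTEST_TOTAL_SHARDS"] = str(total_shards)
        env["GTEST_SHARD_INDEX"] = str(index)
        environments.append(env)
    return environments


def run_sharded(test_binary, total_shards=None):
    """Run all shards concurrently and return their exit codes in shard order."""
    if total_shards is None:
        # Assumption: one worker per CPU is a reasonable default.
        total_shards = os.cpu_count() or 1
    processes = [
        subprocess.Popen([test_binary], env=env)
        for env in shard_environments(total_shards)
    ]
    return [process.wait() for process in processes]
```

A run would count as passing only if every returned exit code is zero. Tests that share fixed ports or filesystem paths may interfere with each other when run this way, so some isolation work would likely be needed first.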





[jira] [Updated] (MESOS-2315) Deprecate / Remove CommandInfo::ContainerInfo

2015-10-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-2315:
--
Shepherd: Adam B
  Labels: mesosphere newbie  (was: )

> Deprecate / Remove CommandInfo::ContainerInfo
> -
>
> Key: MESOS-2315
> URL: https://issues.apache.org/jira/browse/MESOS-2315
> Project: Mesos
>  Issue Type: Task
>Reporter: Ian Downes
>Priority: Minor
>  Labels: mesosphere, newbie
>
> IIUC this has been deprecated and all current code (except 
> examples/docker_no_executor_framework.cpp) uses the top-level ContainerInfo?





[jira] [Commented] (MESOS-2315) Deprecate / Remove CommandInfo::ContainerInfo

2015-10-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947895#comment-14947895
 ] 

Adam B commented on MESOS-2315:
---

Yes, let's deprecate this! It's only confusing our newer framework developers.
I'm willing to shepherd.

> Deprecate / Remove CommandInfo::ContainerInfo
> -
>
> Key: MESOS-2315
> URL: https://issues.apache.org/jira/browse/MESOS-2315
> Project: Mesos
>  Issue Type: Task
>Reporter: Ian Downes
>Priority: Minor
>
> IIUC this has been deprecated and all current code (except 
> examples/docker_no_executor_framework.cpp) uses the top-level ContainerInfo?





[jira] [Created] (MESOS-3604) ExamplesTest.PersistentVolumeFramework does not work in OS X El Capitan

2015-10-07 Thread Michael Park (JIRA)
Michael Park created MESOS-3604:
---

 Summary: ExamplesTest.PersistentVolumeFramework does not work in 
OS X El Capitan
 Key: MESOS-3604
 URL: https://issues.apache.org/jira/browse/MESOS-3604
 Project: Mesos
  Issue Type: Bug
Reporter: Michael Park


The example persistent volume framework test does not pass on OS X El Capitan. 
It seems to be executing {{/src/.libs/mesos-executor}} directly 
while it should be executing the wrapper script at 
{{/src/mesos-executor}} instead. The no-executor framework passes, 
however, even though it seems to have a very similar configuration to the 
persistent volume framework. The following output shows the {{dyld}} load 
error:

{noformat}
I1008 01:22:52.280140 4284416 launcher.cpp:132] Forked child with pid '1706' for container 'b6d3bd96-2ebd-47b1-a16a-a22ffba992aa'
I1008 01:22:52.280300 4284416 containerizer.cpp:873] Checkpointing executor's forked pid 1706 to '/var/folders/p6/nfxknpz52dzfc6zqnz23tq18gn/T/mesos-XX.5OZ3locB/0/meta/slaves/34d6329e-69cb-4a72-aee4-fe892bf1c70b-S2/frameworks/34d6329e-69cb-4a72-aee4-fe892bf1c70b-/executors/dec188d4-d2dc-40c5-ac4d-881adc3d81c0/runs/b6d3bd96-2ebd-47b1-a16a-a22ffba992aa/pids/forked.pid'
dyld: Library not loaded: /usr/local/lib/libmesos-0.26.0.dylib
  Referenced from: /Users/mpark/Projects/mesos/build/src/.libs/mesos-executor
  Reason: image not found
dyld: Library not loaded: /usr/local/lib/libmesos-0.26.0.dylib
  Referenced from: /Users/mpark/Projects/mesos/build/src/.libs/mesos-executor
  Reason: image not found
dyld: Library not loaded: /usr/local/lib/libmesos-0.26.0.dylib
  Referenced from: /Users/mpark/Projects/mesos/build/src/.libs/mesos-executor
  Reason: image not found
I1008 01:22:52.365397 3211264 containerizer.cpp:1284] Executor for container 
'06b649be-88c8-4047-8fb5-e89bdd096b66' has exited
I1008 01:22:52.365433 3211264 containerizer.cpp:1097] Destroying container 
'06b649be-88c8-4047-8fb5-e89bdd096b66'
{noformat}





[jira] [Commented] (MESOS-3603) Test build failure due to comparison between signed and unsigned integers

2015-10-07 Thread Kapil Arya (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947802#comment-14947802
 ] 

Kapil Arya commented on MESOS-3603:
---

The fix is already in the master branch; it just needs to be cherry-picked into rc2:
bfeb070a2aef52f445eb057076d344fd184eb461


> Test build failure due to comparison between signed and unsigned integers
> -
>
> Key: MESOS-3603
> URL: https://issues.apache.org/jira/browse/MESOS-3603
> Project: Mesos
>  Issue Type: Bug
> Environment: {code}
> $ uname -a
> Linux thinkpad 4.1.6-3-desktop #1 SMP PREEMPT Fri Aug 28 10:59:34 UTC 2015 
> (d867e86) x86_64 x86_64 x86_64 GNU/Linux
> $ gcc --version
> gcc (SUSE Linux) 5.1.1 20150713 [gcc-5-branch revision 225736]
> Copyright (C) 2015 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.  There is NO
> warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
> $ /lib64/libc.so.6
> GNU C Library (GNU libc) stable release version 2.22 (git bbab82c25da9), by 
> Roland McGrath et al.
> Copyright (C) 2015 Free Software Foundation, Inc.
> This is free software; see the source for copying conditions.
> There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
> PARTICULAR PURPOSE.
> Configured for x86_64-suse-linux.
> Compiled by GNU CC version 5.1.1 20150713 [gcc-5-branch revision 225736].
> Available extensions:
> crypt add-on version 2.1 by Michael Glad and others
> GNU Libidn by Simon Josefsson
> Native POSIX Threads Library by Ulrich Drepper et al
> BIND-8.2.3-T5B
> libc ABIs: UNIQUE IFUNC
> For bug reporting instructions, please see:
> .
> $ cat /etc/SuSE-release
> openSUSE 20150909 (x86_64)
> VERSION = 20150909
> CODENAME = Tumbleweed
> # /etc/SuSE-release is deprecated and will be removed in the future, use 
> /etc/os-release instead
> {code}
>Reporter: Kapil Arya
>Assignee: Kapil Arya
>Priority: Blocker
>  Labels: mesosphere
>
> Compilation fails on OpenSUSE Tumbleweed (Linux 4.1.6, gcc 5.1.1, glibc 2.22) 
> with the following errors:
> {code}
> In file included from ../../src/tests/values_tests.cpp:22:0:
> ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h: In instantiation of ‘testing::AssertionResult testing::internal::CmpHelperEQ(const char*, const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int]’:
> ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1484:23:   required from ‘static testing::AssertionResult testing::internal::EqHelper<lhs_is_null_literal>::Compare(const char*, const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int; bool lhs_is_null_literal = false]’
> ../../src/tests/values_tests.cpp:287:3:   required from here
> ../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1448:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
>   if (expected == actual) {
>                ^
>  CXX  tests/containerizer/mesos_tests-provisioner_docker_tests.o
> ^CMakefile:6779: recipe for target 'tests/mesos_tests-values_tests.o' failed
> make[3]: *** [tests/mesos_tests-values_tests.o] Interrupt
> {code}





[jira] [Created] (MESOS-3603) Test build failure due to comparison between signed and unsigned integers

2015-10-07 Thread Kapil Arya (JIRA)
Kapil Arya created MESOS-3603:
-

 Summary: Test build failure due to comparison between signed and 
unsigned integers
 Key: MESOS-3603
 URL: https://issues.apache.org/jira/browse/MESOS-3603
 Project: Mesos
  Issue Type: Bug
 Environment: {code}

$ uname -a
Linux thinkpad 4.1.6-3-desktop #1 SMP PREEMPT Fri Aug 28 10:59:34 UTC 2015 
(d867e86) x86_64 x86_64 x86_64 GNU/Linux

$ gcc --version
gcc (SUSE Linux) 5.1.1 20150713 [gcc-5-branch revision 225736]
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ /lib64/libc.so.6
GNU C Library (GNU libc) stable release version 2.22 (git bbab82c25da9), by 
Roland McGrath et al.
Copyright (C) 2015 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.
There is NO warranty; not even for MERCHANTABILITY or FITNESS FOR A
PARTICULAR PURPOSE.
Configured for x86_64-suse-linux.
Compiled by GNU CC version 5.1.1 20150713 [gcc-5-branch revision 225736].
Available extensions:
crypt add-on version 2.1 by Michael Glad and others
GNU Libidn by Simon Josefsson
Native POSIX Threads Library by Ulrich Drepper et al
BIND-8.2.3-T5B
libc ABIs: UNIQUE IFUNC
For bug reporting instructions, please see:
.

$ cat /etc/SuSE-release
openSUSE 20150909 (x86_64)
VERSION = 20150909
CODENAME = Tumbleweed
# /etc/SuSE-release is deprecated and will be removed in the future, use 
/etc/os-release instead
{code}
Reporter: Kapil Arya
Assignee: Kapil Arya
Priority: Blocker


Compilation fails on OpenSUSE Tumbleweed (Linux 4.1.6, gcc 5.1.1, glibc 2.22) 
with the following errors:

{code}
In file included from ../../src/tests/values_tests.cpp:22:0:
../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h: In instantiation of ‘testing::AssertionResult testing::internal::CmpHelperEQ(const char*, const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int]’:
../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1484:23:   required from ‘static testing::AssertionResult testing::internal::EqHelper<lhs_is_null_literal>::Compare(const char*, const char*, const T1&, const T2&) [with T1 = int; T2 = long unsigned int; bool lhs_is_null_literal = false]’
../../src/tests/values_tests.cpp:287:3:   required from here
../3rdparty/libprocess/3rdparty/gmock-1.7.0/gtest/include/gtest/gtest.h:1448:16: error: comparison between signed and unsigned integer expressions [-Werror=sign-compare]
  if (expected == actual) {
               ^
 CXX  tests/containerizer/mesos_tests-provisioner_docker_tests.o
^CMakefile:6779: recipe for target 'tests/mesos_tests-values_tests.o' failed
make[3]: *** [tests/mesos_tests-values_tests.o] Interrupt
{code}





[jira] [Commented] (MESOS-3553) LIBPROCESS_IP not passed when executor's environment is specified

2015-10-07 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947708#comment-14947708
 ] 

Greg Mann commented on MESOS-3553:
--

Currently investigating the executor code to see if any communication is 
necessary other than to/from the agent. If not, then the executor's process can 
always just link to the loopback interface and we can prevent any hostname 
lookup.

> LIBPROCESS_IP not passed when executor's environment is specified
> -
>
> Key: MESOS-3553
> URL: https://issues.apache.org/jira/browse/MESOS-3553
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.24.1
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> When the executor's environment is specified explicitly via 
> {{\-\-executor_environment_variables}}, {{LIBPROCESS_IP}} will not be passed, 
> leading to errors in some cases - for example, when no DNS is available.
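
One possible remedy, sketched here in Python purely for illustration, is to add LIBPROCESS_IP to an explicitly specified executor environment whenever the operator has not set it. This is an assumption about the shape of a fix, not the change that was made in Mesos; the function name executor_environment and the agent_ip parameter are hypothetical.

```python
def executor_environment(explicit_env, agent_ip):
    """Return the environment to hand to the executor.

    explicit_env: variables given via --executor_environment_variables.
    agent_ip: address the agent's libprocess is bound to (hypothetical input).
    """
    env = dict(explicit_env)  # copy so the caller's mapping is not mutated
    # Only fill in LIBPROCESS_IP when it was not set explicitly.
    env.setdefault("LIBPROCESS_IP", agent_ip)
    return env
```

An explicitly provided LIBPROCESS_IP still wins in this sketch, so an operator could point the executor at the loopback interface if that turns out to be sufficient.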





[jira] [Updated] (MESOS-3553) LIBPROCESS_IP not passed when executor's environment is specified

2015-10-07 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-3553:
-
Story Points: 2  (was: 3)

> LIBPROCESS_IP not passed when executor's environment is specified
> -
>
> Key: MESOS-3553
> URL: https://issues.apache.org/jira/browse/MESOS-3553
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.24.1
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> When the executor's environment is specified explicitly via 
> {{\-\-executor_environment_variables}}, {{LIBPROCESS_IP}} will not be passed, 
> leading to errors in some cases - for example, when no DNS is available.





[jira] [Updated] (MESOS-3553) LIBPROCESS_IP not passed when executor's environment is specified

2015-10-07 Thread Greg Mann (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3553?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Greg Mann updated MESOS-3553:
-
Shepherd: Adam B
  Sprint: Mesosphere Sprint 20
Story Points: 3
  Labels: mesosphere  (was: )

> LIBPROCESS_IP not passed when executor's environment is specified
> -
>
> Key: MESOS-3553
> URL: https://issues.apache.org/jira/browse/MESOS-3553
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.24.1
>Reporter: Greg Mann
>Assignee: Greg Mann
>  Labels: mesosphere
>
> When the executor's environment is specified explicitly via 
> {{\-\-executor_environment_variables}}, {{LIBPROCESS_IP}} will not be passed, 
> leading to errors in some cases - for example, when no DNS is available.





[jira] [Updated] (MESOS-3417) Log source address replicated log recieved broadcasts

2015-10-07 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-3417:
--
  Sprint: Mesosphere Sprint 20
Story Points: 2

> Log source address replicated log recieved broadcasts
> -
>
> Key: MESOS-3417
> URL: https://issues.apache.org/jira/browse/MESOS-3417
> Project: Mesos
>  Issue Type: Improvement
>  Components: replicated log
>Affects Versions: 0.23.0, 0.24.0
> Environment: Mesos 0.23
>Reporter: Cody Maloney
>Assignee: Neil Conway
>Priority: Minor
>  Labels: mesosphere, newbie
>
> Currently Mesos doesn't log what machine a replicated log status broadcast 
> was received from:
> {code}
> Sep 11 21:41:14 master-01 mesos-master[15625]: I0911 21:41:14.320164 15637 
> replica.cpp:641] Replica in EMPTY status received a broadcasted recover 
> request
> Sep 11 21:41:14 master-01 mesos-dns[15583]: I0911 21:41:14.321097   15583 
> detect.go:118] ignoring children-changed event, leader has not changed: /mesos
> Sep 11 21:41:14 master-01 mesos-master[15625]: I0911 21:41:14.353914 15639 
> replica.cpp:641] Replica in EMPTY status received a broadcasted recover 
> request
> Sep 11 21:41:14 master-01 mesos-master[15625]: I0911 21:41:14.479132 15639 
> replica.cpp:641] Replica in EMPTY status received a broadcasted recover 
> request
> {code}
> It would be really useful for debugging replicated log startup issues to have 
> info about where the message came from (libprocess address, IP, or hostname).





[jira] [Commented] (MESOS-3417) Log source address replicated log recieved broadcasts

2015-10-07 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947650#comment-14947650
 ] 

Neil Conway commented on MESOS-3417:


https://reviews.apache.org/r/39104/

> Log source address replicated log recieved broadcasts
> -
>
> Key: MESOS-3417
> URL: https://issues.apache.org/jira/browse/MESOS-3417
> Project: Mesos
>  Issue Type: Improvement
>  Components: replicated log
>Affects Versions: 0.23.0, 0.24.0
> Environment: Mesos 0.23
>Reporter: Cody Maloney
>Assignee: Neil Conway
>Priority: Minor
>  Labels: mesosphere, newbie
>
> Currently Mesos doesn't log what machine a replicated log status broadcast 
> was received from:
> {code}
> Sep 11 21:41:14 master-01 mesos-master[15625]: I0911 21:41:14.320164 15637 
> replica.cpp:641] Replica in EMPTY status received a broadcasted recover 
> request
> Sep 11 21:41:14 master-01 mesos-dns[15583]: I0911 21:41:14.321097   15583 
> detect.go:118] ignoring children-changed event, leader has not changed: /mesos
> Sep 11 21:41:14 master-01 mesos-master[15625]: I0911 21:41:14.353914 15639 
> replica.cpp:641] Replica in EMPTY status received a broadcasted recover 
> request
> Sep 11 21:41:14 master-01 mesos-master[15625]: I0911 21:41:14.479132 15639 
> replica.cpp:641] Replica in EMPTY status received a broadcasted recover 
> request
> {code}
> It would be really useful for debugging replicated log startup issues to have 
> info about where the message came from (libprocess address, IP, or hostname).





[jira] [Updated] (MESOS-3602) hdfs du fails due to prepended / on path

2015-10-07 Thread alexius ludeman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

alexius ludeman updated MESOS-3602:
---
Summary: hdfs du fails due to prepended / on path  (was: hdfs du fails due 
to prepend / on path)

> hdfs du fails due to prepended / on path
> 
>
> Key: MESOS-3602
> URL: https://issues.apache.org/jira/browse/MESOS-3602
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Affects Versions: 0.23.0
>Reporter: alexius ludeman
>
> hdfs.hpp du() fails to run.  It appears to prepend "/", but the path passed in 
> is a URI such as "hdfs:///a/path/to/artifact.tar.gz".
> W1007 13:46:25.791894 373116928 fetcher.cpp:436] Reverting to fetching 
> directly into the sandbox for 
> 'hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz',
>  due to failure to fetch through the cache, with error: Could not determine 
> size of cache file for 
> 'lexinator@hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz'
>  with error: Hadoop client could not determine size: HDFS du returned an 
> unexpected number of results: '2015-10-07 13:46:21,958 WARN  [main] 
> util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> -du: java.net.URISyntaxException: Expected scheme-specific part at index 5: 
> hdfs:
> Usage: hadoop fs [generic options] -du [-s] [-h] <path> ...
> The command it's running is:
> /usr/bin/env bash /.../hadoop fs -du -h 
> /hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz





[jira] [Updated] (MESOS-3602) hdfs du fails due to prepend / on path

2015-10-07 Thread alexius ludeman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

alexius ludeman updated MESOS-3602:
---
Summary: hdfs du fails due to prepend / on path  (was: hdfs du fails)

> hdfs du fails due to prepend / on path
> --
>
> Key: MESOS-3602
> URL: https://issues.apache.org/jira/browse/MESOS-3602
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Affects Versions: 0.23.0
>Reporter: alexius ludeman
>
> hdfs.hpp du() fails to run.  It appears to prepend "/", but the path passed in 
> is a URI such as "hdfs:///a/path/to/artifact.tar.gz".
> W1007 13:46:25.791894 373116928 fetcher.cpp:436] Reverting to fetching 
> directly into the sandbox for 
> 'hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz',
>  due to failure to fetch through the cache, with error: Could not determine 
> size of cache file for 
> 'lexinator@hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz'
>  with error: Hadoop client could not determine size: HDFS du returned an 
> unexpected number of results: '2015-10-07 13:46:21,958 WARN  [main] 
> util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> -du: java.net.URISyntaxException: Expected scheme-specific part at index 5: 
> hdfs:
> Usage: hadoop fs [generic options] -du [-s] [-h] <path> ...
> The command it's running is:
> /usr/bin/env bash /.../hadoop fs -du -h 
> /hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz





[jira] [Updated] (MESOS-3602) hdfs du fails

2015-10-07 Thread alexius ludeman (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

alexius ludeman updated MESOS-3602:
---
Description: 
hdfs.hpp du() fails to run.  It appears to prepend "/", but the path passed in 
is a URI such as "hdfs:///a/path/to/artifact.tar.gz".

W1007 13:46:25.791894 373116928 fetcher.cpp:436] Reverting to fetching directly 
into the sandbox for 
'hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz',
 due to failure to fetch through the cache, with error: Could not determine 
size of cache file for 
'lexinator@hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz'
 with error: Hadoop client could not determine size: HDFS du returned an 
unexpected number of results: '2015-10-07 13:46:21,958 WARN  [main] 
util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
-du: java.net.URISyntaxException: Expected scheme-specific part at index 5: 
hdfs:
Usage: hadoop fs [generic options] -du [-s] [-h] <path> ...

The command it's running is:
/usr/bin/env bash /.../hadoop fs -du -h 
/hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz

  was:
hdfs.hpp du() fails to run.  It appears to blindly prepend "/", but the path 
passed in is a URI such as "hdfs:///a/path/to/artifact.tar.gz".

W1007 13:46:25.791894 373116928 fetcher.cpp:436] Reverting to fetching directly 
into the sandbox for 
'hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz',
 due to failure to fetch through the cache, with error: Could not determine 
size of cache file for 
'lexinator@hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz'
 with error: Hadoop client could not determine size: HDFS du returned an 
unexpected number of results: '2015-10-07 13:46:21,958 WARN  [main] 
util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
-du: java.net.URISyntaxException: Expected scheme-specific part at index 5: 
hdfs:
Usage: hadoop fs [generic options] -du [-s] [-h] <path> ...

The command it's running is:
/usr/bin/env bash /.../hadoop fs -du -h 
/hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz


> hdfs du fails
> -
>
> Key: MESOS-3602
> URL: https://issues.apache.org/jira/browse/MESOS-3602
> Project: Mesos
>  Issue Type: Bug
>  Components: fetcher
>Affects Versions: 0.23.0
>Reporter: alexius ludeman
>
> hdfs.hpp du() fails to run.  It appears to prepend "/", but the path passed in 
> is a URI such as "hdfs:///a/path/to/artifact.tar.gz".
> W1007 13:46:25.791894 373116928 fetcher.cpp:436] Reverting to fetching 
> directly into the sandbox for 
> 'hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz',
>  due to failure to fetch through the cache, with error: Could not determine 
> size of cache file for 
> 'lexinator@hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz'
>  with error: Hadoop client could not determine size: HDFS du returned an 
> unexpected number of results: '2015-10-07 13:46:21,958 WARN  [main] 
> util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load 
> native-hadoop library for your platform... using builtin-java classes where 
> applicable
> -du: java.net.URISyntaxException: Expected scheme-specific part at index 5: 
> hdfs:
> Usage: hadoop fs [generic options] -du [-s] [-h] <path> ...
> The command it's running is:
> /usr/bin/env bash /.../hadoop fs -du -h 
> /hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz





[jira] [Created] (MESOS-3602) hdfs du fails

2015-10-07 Thread alexius ludeman (JIRA)
alexius ludeman created MESOS-3602:
--

 Summary: hdfs du fails
 Key: MESOS-3602
 URL: https://issues.apache.org/jira/browse/MESOS-3602
 Project: Mesos
  Issue Type: Bug
  Components: fetcher
Affects Versions: 0.23.0
Reporter: alexius ludeman


hdfs.hpp du() fails to run.  It appears to blindly prepend "/", but the path 
passed in is a URI such as "hdfs:///a/path/to/artifact.tar.gz".

W1007 13:46:25.791894 373116928 fetcher.cpp:436] Reverting to fetching directly 
into the sandbox for 
'hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz',
 due to failure to fetch through the cache, with error: Could not determine 
size of cache file for 
'lexinator@hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz'
 with error: Hadoop client could not determine size: HDFS du returned an 
unexpected number of results: '2015-10-07 13:46:21,958 WARN  [main] 
util.NativeCodeLoader (NativeCodeLoader.java:<clinit>(62)) - Unable to load 
native-hadoop library for your platform... using builtin-java classes where 
applicable
-du: java.net.URISyntaxException: Expected scheme-specific part at index 5: 
hdfs:
Usage: hadoop fs [generic options] -du [-s] [-h] <path> ...

The command it's running is:
/usr/bin/env bash /.../hadoop fs -du -h 
/hdfs:///a/path/to/3.3.2-SNAPSHOT/executor-3.3.2-SNAPSHOT-artifact-with-dependencies-archive.tar.gz
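
The behavior the report points at can be sketched as follows. This is only an illustration in Python, not the C++ code in hdfs.hpp, and the helper name absolute_hdfs_path is hypothetical. It assumes the intended rule is "prepend '/' only to bare relative paths, and leave anything that already carries a URI scheme alone".

```python
from urllib.parse import urlsplit


def absolute_hdfs_path(path: str) -> str:
    """Return a path suitable for passing to `hadoop fs -du`.

    Hypothetical helper: URIs with a scheme (e.g. hdfs://) and paths that
    are already absolute are returned unchanged; only bare relative paths
    get a leading "/".
    """
    if urlsplit(path).scheme:
        # Already a URI such as hdfs:///a/path; prepending "/" would break it.
        return path
    if path.startswith("/"):
        return path
    return "/" + path
```

With this rule, "hdfs:///a/path/to/artifact.tar.gz" is passed through as-is instead of becoming "/hdfs:///a/path/to/artifact.tar.gz".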





[jira] [Commented] (MESOS-3599) COMMAND health checks with marathon running in slave context broken

2015-10-07 Thread Erhan Kesken (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947523#comment-14947523
 ] 

Erhan Kesken commented on MESOS-3599:
-

I think the second approach is the best option: it keeps compatibility with old 
health check commands, prevents confusion, and simplifies the workaround code 
without breaking existing ones.

The only drawback is that it disturbs 0.23.1 and 0.24.1 users, but they are 
probably a minority. If you can distribute the 0.23.2 and 0.24.2 releases quickly 
for this issue, people may not even notice this temporary behavior change.


> COMMAND health checks with marathon running in slave context broken
> ---
>
> Key: MESOS-3599
> URL: https://issues.apache.org/jira/browse/MESOS-3599
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Erhan Kesken
>Assignee: haosdent
>Priority: Critical
>
> When deploying Mesos 0.23rc4 with the latest Marathon 0.10.0 RC3, command health 
> checks stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3601) Formalize all headers and metadata for HTTP API Event Stream

2015-10-07 Thread Ben Whitehead (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947516#comment-14947516
 ] 

Ben Whitehead commented on MESOS-3601:
--

As I understand it, part of the motivation for RecordIO was to be tolerant of 
proxies (HTTP 1.0 or 1.1) between the scheduler and Mesos, since according to 
the HTTP spec these proxies can change chunk sizes.  Being tolerant of a 1.0 
proxy would require the connection semantics to be explicitly defined rather 
than depending on default handling.

You are correct that keep-alive can be used to express intent for handling 
request pipelining, which we probably don't want on the specific socket the 
event stream is flowing over, but it could be useful when other Calls are being 
sent to the server.

In the case of the event stream itself, {{Connection: close}} is probably more 
appropriate.

> Formalize all headers and metadata for HTTP API Event Stream
> 
>
> Key: MESOS-3601
> URL: https://issues.apache.org/jira/browse/MESOS-3601
> Project: Mesos
>  Issue Type: Improvement
>Affects Versions: 0.24.0
> Environment: Mesos 0.24.0
>Reporter: Ben Whitehead
>  Labels: api, http, wireprotocol
>
> From an HTTP standpoint, the current set of headers returned when connecting 
> to the HTTP scheduler API is insufficient. 
> {code:title=current headers}
> HTTP/1.1 200 OK
> Transfer-Encoding: chunked
> Date: Wed, 30 Sep 2015 21:07:16 GMT
> Content-Type: application/json
> {code}
> Since the response from mesos is intended to function as a stream 
> {{Connection: keep-alive}} should be specified so that the connection can 
> remain open.
> If RecordIO is going to be applied to the messages, the headers should 
> include the information necessary for a client to be able to detect RecordIO 
> and set up its response handlers appropriately.
> How RecordIO is expressed will come down to the semantics of what is actually 
> "Returned" as the response from {{POST /api/v1/scheduler}}.
> h4. Proposal
> One approach would be to leverage http as much as possible, having a client 
> specify an {{Accept-Encoding}} along with the {{Accept}} header to indicate 
> that it can handle RecordIO {{Content-Encoding}} of {{Content-Type}} 
> messages.  (This approach allows for things like gzip to be woven in fairly 
> easily in the future)
> For this approach I would expect the following:
> {code:title=Request}
> POST /api/v1/scheduler HTTP/1.1
> Host: localhost:5050
> Accept: application/x-protobuf
> Accept-Encoding: recordio
> Content-Type: application/x-protobuf
> Content-Length: 35
> User-Agent: RxNetty Client
> {code}
> {code:title=Response}
> HTTP/1.1 200 OK
> Connection: keep-alive
> Transfer-Encoding: chunked
> Content-Type: application/x-protobuf
> Content-Encoding: recordio
> Cache-Control: no-transform
> {code}
> When Content-Encoding is used, it is recommended to set {{Cache-Control: 
> no-transform}} to signal to any proxies that no transformation should be 
> applied to the content encoding [Section 14.11 RFC 
> 2616|http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.11].





[jira] [Commented] (MESOS-3519) Fix file descriptor leakage / double close in the code base

2015-10-07 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947485#comment-14947485
 ] 

Benjamin Mahler commented on MESOS-3519:


[~chzhcn] looks like we have to fix the code using fts_open as well:

https://github.com/apache/mesos/blob/0.25.0-rc2/3rdparty/libprocess/3rdparty/stout/include/stout/posix/os.hpp#L233
https://github.com/apache/mesos/blob/0.25.0-rc2/src/linux/cgroups.cpp#L911

Both of these have some returns where the call to fts_close is missing.

> Fix file descriptor leakage / double close in the code base
> ---
>
> Key: MESOS-3519
> URL: https://issues.apache.org/jira/browse/MESOS-3519
> Project: Mesos
>  Issue Type: Bug
>Reporter: Chi Zhang
>Assignee: Chi Zhang
> Fix For: 0.25.0
>
>






[jira] [Commented] (MESOS-2467) Allow --resources flag to take JSON.

2015-10-07 Thread Greg Mann (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-2467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947422#comment-14947422
 ] 

Greg Mann commented on MESOS-2467:
--

Documentation review here: https://reviews.apache.org/r/39102/

> Allow --resources flag to take JSON.
> 
>
> Key: MESOS-2467
> URL: https://issues.apache.org/jira/browse/MESOS-2467
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Jie Yu
>Assignee: Greg Mann
>  Labels: mesosphere
>
> Currently, we use a custom format for the --resources flag. As we introduce 
> more and more features (e.g., persistence, reservation) in the Resource object, we 
> need a more generic way to specify --resources.
> For backward compatibility, we can scan the first character. If it is '[', 
> then we invoke the JSON parser. Otherwise, we use the existing parser.
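
The dispatch described above can be sketched as follows. This is a Python illustration, not the Mesos C++ implementation. parse_legacy is a simplified stand-in for the existing parser that assumes a "name:value;name:value" shape, and skipping leading whitespace before looking at the first character is an addition of this sketch rather than something the ticket asks for.

```python
import json


def parse_legacy(text: str) -> list:
    """Simplified stand-in for the existing parser ("name:value;name:value")."""
    resources = []
    for item in text.split(";"):
        if not item.strip():
            continue
        name, _, value = item.partition(":")
        resources.append({"name": name.strip(), "value": value.strip()})
    return resources


def parse_resources(text: str) -> list:
    """Pick a parser by looking at the first non-blank character."""
    stripped = text.lstrip()
    if stripped.startswith("["):
        # A leading '[' means the flag value is a JSON array.
        return json.loads(stripped)
    return parse_legacy(stripped)
```

Because only the first character is inspected, a legacy value that merely contains a bracket later on, such as a range, still goes to the existing parser.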





[jira] [Commented] (MESOS-3601) Formalize all headers and metadata for HTTP API Event Stream

2015-10-07 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947395#comment-14947395
 ] 

Benjamin Mahler commented on MESOS-3601:


Hm... the {{Connection: keep-alive}} header is for HTTP 1.0; all connections in 
HTTP 1.1 are persistent by default.

It would actually be more appropriate for us to use HTTP 1.1's {{Connection: 
close}} header to prevent the caller from trying to pipeline on this 
connection, since the stream response is infinite. IIUC the {{Connection}} 
header has to do with whether multiple requests can occur on the same 
connection, not sure how you're interpreting its meaning..?

> Formalize all headers and metadata for HTTP API Event Stream
> 
>
> Key: MESOS-3601
> URL: https://issues.apache.org/jira/browse/MESOS-3601
> Project: Mesos
>  Issue Type: Improvement
>Affects Versions: 0.24.0
> Environment: Mesos 0.24.0
>Reporter: Ben Whitehead
>  Labels: api, http, wireprotocol
>
> From an HTTP standpoint, the current set of headers returned when connecting 
> to the HTTP scheduler API is insufficient. 
> {code:title=current headers}
> HTTP/1.1 200 OK
> Transfer-Encoding: chunked
> Date: Wed, 30 Sep 2015 21:07:16 GMT
> Content-Type: application/json
> {code}
> Since the response from mesos is intended to function as a stream 
> {{Connection: keep-alive}} should be specified so that the connection can 
> remain open.
> If RecordIO is going to be applied to the messages, the headers should 
> include the information necessary for a client to be able to detect RecordIO 
> and set up its response handlers appropriately.
> How RecordIO is expressed will come down to the semantics of what is actually 
> "Returned" as the response from {{POST /api/v1/scheduler}}.
> h4. Proposal
> One approach would be to leverage http as much as possible, having a client 
> specify an {{Accept-Encoding}} along with the {{Accept}} header to indicate 
> that it can handle RecordIO {{Content-Encoding}} of {{Content-Type}} 
> messages.  (This approach allows for things like gzip to be woven in fairly 
> easily in the future)
> For this approach I would expect the following:
> {code:title=Request}
> POST /api/v1/scheduler HTTP/1.1
> Host: localhost:5050
> Accept: application/x-protobuf
> Accept-Encoding: recordio
> Content-Type: application/x-protobuf
> Content-Length: 35
> User-Agent: RxNetty Client
> {code}
> {code:title=Response}
> HTTP/1.1 200 OK
> Connection: keep-alive
> Transfer-Encoding: chunked
> Content-Type: application/x-protobuf
> Content-Encoding: recordio
> Cache-Control: no-transform
> {code}
> When Content-Encoding is used, it is recommended to set {{Cache-Control: 
> no-transform}} to signal to any proxies that no transformation should be 
> applied to the content encoding [Section 14.11 RFC 
> 2616|http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.11].





[jira] [Created] (MESOS-3601) Formalize all headers and metadata for HTTP API Event Stream

2015-10-07 Thread Ben Whitehead (JIRA)
Ben Whitehead created MESOS-3601:


 Summary: Formalize all headers and metadata for HTTP API Event 
Stream
 Key: MESOS-3601
 URL: https://issues.apache.org/jira/browse/MESOS-3601
 Project: Mesos
  Issue Type: Improvement
Affects Versions: 0.24.0
 Environment: Mesos 0.24.0
Reporter: Ben Whitehead


From an HTTP standpoint, the current set of headers returned when connecting 
to the HTTP scheduler API is insufficient. 
{code:title=current headers}
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Date: Wed, 30 Sep 2015 21:07:16 GMT
Content-Type: application/json
{code}

Since the response from mesos is intended to function as a stream {{Connection: 
keep-alive}} should be specified so that the connection can remain open.

If RecordIO is going to be applied to the messages, the headers should include 
the information necessary for a client to be able to detect RecordIO and set up 
its response handlers appropriately.

How RecordIO is expressed will come down to the semantics of what is actually 
"Returned" as the response from {{POST /api/v1/scheduler}}.

h4. Proposal
One approach would be to leverage http as much as possible, having a client 
specify an {{Accept-Encoding}} along with the {{Accept}} header to indicate 
that it can handle RecordIO {{Content-Encoding}} of {{Content-Type}} messages.  
(This approach allows for things like gzip to be woven in fairly easily in the 
future)

For this approach I would expect the following:
{code:title=Request}
POST /api/v1/scheduler HTTP/1.1
Host: localhost:5050
Accept: application/x-protobuf
Accept-Encoding: recordio
Content-Type: application/x-protobuf
Content-Length: 35
User-Agent: RxNetty Client
{code}
{code:title=Response}
HTTP/1.1 200 OK
Connection: keep-alive
Transfer-Encoding: chunked
Content-Type: application/x-protobuf
Content-Encoding: recordio
Cache-Control: no-transform
{code}

When Content-Encoding is used, it is recommended to set {{Cache-Control: 
no-transform}} to signal to any proxies that no transformation should be 
applied to the content encoding [Section 14.11 RFC 
2616|http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.11].
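
On the client side, the detection step in the proposal could look roughly like the Python sketch below. The function name uses_recordio is hypothetical, and the "recordio" content coding is the value proposed in this ticket, not something assumed to be registered or already returned by Mesos.

```python
def uses_recordio(headers: dict) -> bool:
    """Return True if the response declares `Content-Encoding: recordio`.

    Header names are compared case-insensitively, and the header value is
    treated as a comma-separated list of content codings.
    """
    normalized = {name.lower(): value for name, value in headers.items()}
    encodings = normalized.get("content-encoding", "")
    return "recordio" in [e.strip().lower() for e in encodings.split(",")]
```

A client could call this on the response headers before choosing between a plain body reader and a RecordIO decoder.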







[jira] [Updated] (MESOS-3566) Add a section to the Scheduler HTTP API docs around RecordIO specification

2015-10-07 Thread Anand Mazumdar (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anand Mazumdar updated MESOS-3566:
--
Description: 
Since the {{RecordIO}} format is not that widely used, searching for it online 
does not offer much help. 
- It would be good if we can add to the docs, a small section on its 
specification for framework developers. 
- Also, add details on why the {{RecordIO}} format is being used and why just using 
vanilla {{ChunkedEncoding}} and encoding one event per chunk won't suffice.
- Bonus points, if we can have a simple code snippet in C++/Java on reading a 
{{RecordIO}} response to help developers.

  was:
Since the {{RecordIO}} format is not that widely used, searching for it online 
does not offer much help. 
- It would be good if we can add to the docs, a small section on its 
specification for framework developers. 
- Bonus points, if we can have a simple code snippet in C++/Java on reading a 
{{RecordIO}} response to help developers.


> Add a section to the Scheduler HTTP API docs around RecordIO specification
> --
>
> Key: MESOS-3566
> URL: https://issues.apache.org/jira/browse/MESOS-3566
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Anand Mazumdar
>  Labels: mesosphere
>
> Since the {{RecordIO}} format is not that widely used, searching for it 
> online does not offer much help. 
> - It would be good if we can add to the docs, a small section on its 
> specification for framework developers. 
> - Also, add details on why the {{RecordIO}} format is being used and why just 
> using vanilla {{ChunkedEncoding}} and encoding one event per chunk won't 
> suffice.
> - Bonus points, if we can have a simple code snippet in C++/Java on reading a 
> {{RecordIO}} response to help developers.
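
As a rough idea of what such a snippet might look like, here is a Python sketch of an incremental reader. The framing it assumes (an ASCII decimal byte length, a newline, then exactly that many bytes of record data) is an assumption of this sketch and is not spelled out in the ticket. The class name RecordIODecoder is hypothetical.

```python
class RecordIODecoder:
    """Incrementally splits a byte stream into records.

    Assumed framing (not stated in the ticket): ASCII decimal length,
    a newline, then exactly that many bytes of record data.
    """

    def __init__(self) -> None:
        self._buffer = b""

    def feed(self, chunk: bytes) -> list:
        """Add a chunk of any size; return the records it completes."""
        self._buffer += chunk
        records = []
        while True:
            header, sep, rest = self._buffer.partition(b"\n")
            if not sep:
                break  # length prefix not complete yet
            length = int(header.decode("ascii"))
            if len(rest) < length:
                break  # record body not complete yet
            records.append(rest[:length])
            self._buffer = rest[length:]
        return records
```

The decoder never relies on where one chunk ends and the next begins, so it keeps working if an intermediary re-chunks the stream. A scheme of one event per HTTP chunk would not survive that.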





[jira] [Commented] (MESOS-3566) Add a section to the Scheduler HTTP API docs around RecordIO specification

2015-10-07 Thread Ben Whitehead (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947324#comment-14947324
 ] 

Ben Whitehead commented on MESOS-3566:
--

The documentation for the HTTP API should also explain why RecordIO is being 
used. What issue is it addressing?

> Add a section to the Scheduler HTTP API docs around RecordIO specification
> --
>
> Key: MESOS-3566
> URL: https://issues.apache.org/jira/browse/MESOS-3566
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Anand Mazumdar
>  Labels: mesosphere
>
> Since the {{RecordIO}} format is not that widely used, searching for it 
> online does not offer much help. 
> - It would be good if we can add to the docs, a small section on its 
> specification for framework developers. 
> - Bonus points, if we can have a simple code snippet in C++/Java on reading a 
> {{RecordIO}} response to help developers.





[jira] [Commented] (MESOS-1113) Refactor cgroup interface in preparation for Systemd NWO.

2015-10-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947290#comment-14947290
 ] 

Adam B commented on MESOS-1113:
---

[~jvanremoortere] is this issue a duplicate of MESOS-3425 ?

> Refactor cgroup interface in preparation for Systemd NWO.
> -
>
> Key: MESOS-1113
> URL: https://issues.apache.org/jira/browse/MESOS-1113
> Project: Mesos
>  Issue Type: Story
>  Components: containerization
>Affects Versions: 0.19.0
>Reporter: Timothy St. Clair
>Assignee: Timothy Chen
>  Labels: cgroups, mesosphere
>
> In coming releases cgroups will no longer have its own interface; all 
> interactions will go through systemd's D-Bus interface:
> http://www.freedesktop.org/wiki/Software/systemd/ControlGroupInterface/
> This ticket is to track and allow the refactoring and migration that will be 
> required in order to support this.  
> - Original Message -
> > From: "Lennart Poettering" 
> > To: "Development discussions related to Fedora" 
> > 
> > Cc: "Fedora Big Data SIG" 
> > Sent: Monday, March 17, 2014 8:18:37 PM
> > Subject: Re: Systemd & cgroups & NWO
> > 
> > Well, the nebulous choice of words is intended, since we don't want to
> > make specific promises on time-frames...
> > 
> > The APIs described (tersely) at the end of the wiki page describe the
> > status quo with systemd 211.
> > 
> > The "single-writer" cgroup tree stuff Tejun has been working on for the
> > kernel is now working on his machine, but it's not pushed upstream and
> > will take a while before it will hit Fedora.
> > 
> > At this point in time you hence still may create cgroups directly
> > yourself (but only if you follow the pax cgroup document), however, we
> > strongly encourage you to instead use scopes/slices to create them, as
> > discussed on the wiki page. This way the cgroups transition will be
> > abstracted away from you. You have control of a number of knobs that
> > systemd will expose for you, such as CPUShares=, BlockIOWeight= and so
> > on, but this is not complete, and primarily so because it's not clear
> > that those other properties will continue to exist the way they are in
> > the kernel. To read statistics data or to write knobs that systemd
> > doesn't cover you need to go directly to the cgroupfs. For that, simply
> > read /proc/self/cgroup to find out your own cgroup, and then operate on
> > that. However, as during the single-writer cgroup transition the kernel
> > interface how we set things up will change, be prepared that things
> > might break...
> > 
> > Lennart
> > 
> > --
> > Lennart Poettering, Red Hat





[jira] [Commented] (MESOS-3566) Add a section to the Scheduler HTTP API docs around RecordIO specification

2015-10-07 Thread Ben Whitehead (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947282#comment-14947282
 ] 

Ben Whitehead commented on MESOS-3566:
--

I would prefer there to be some documentation on RecordIO independent of the 
HTTP API itself; this will make it much easier to locate the spec later and 
make it referenceable by other things.

It would be preferable to have it spec'd as explicitly as HTTP chunked transfer 
encoding is: 
http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.6.1

> Add a section to the Scheduler HTTP API docs around RecordIO specification
> --
>
> Key: MESOS-3566
> URL: https://issues.apache.org/jira/browse/MESOS-3566
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Anand Mazumdar
>  Labels: mesosphere
>
> Since the {{RecordIO}} format is not that widely used, searching for it 
> online does not offer much help. 
> - It would be good if we can add to the docs, a small section on its 
> specification for framework developers. 
> - Bonus points, if we can have a simple code snippet in C++/Java on reading a 
> {{RecordIO}} response to help developers.





[jira] [Updated] (MESOS-3560) JSON-based credential files do not work correctly

2015-10-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-3560:
--
Target Version/s: 0.26.0
   Fix Version/s: (was: 0.26.0)

> JSON-based credential files do not work correctly
> -
>
> Key: MESOS-3560
> URL: https://issues.apache.org/jira/browse/MESOS-3560
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Michael Park
>Assignee: Isabel Jimenez
>  Labels: mesosphere
>
> Specifying the following credentials file:
> {code}
> {
>   "credentials": [
>     {
>       "principal": "user",
>       "secret": "password"
>     }
>   ]
> }
> {code}
> Then hitting a master endpoint with:
> {code}
> curl -i -u "user:password" ...
> {code}
> Does not work. This is contrary to the text-based credentials file which 
> works:
> {code}
> user password
> {code}
> Currently, the password in a JSON-based credentials file needs to be 
> base64-encoded in order for it to work:
> {code}
> {
>   "credentials": [
>     {
>       "principal": "user",
>       "secret": "cGFzc3dvcmQ="
>     }
>   ]
> }
> {code}
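
Until the bug is fixed, the base64 workaround above could be scripted roughly as follows. This is a Python sketch with a hypothetical helper name, credentials_json. It only reproduces the file shape shown in the ticket and makes no claim about how the master decodes it.

```python
import base64
import json


def credentials_json(principal: str, secret: str) -> str:
    """Build a credentials document with the secret base64-encoded,
    matching the workaround described in the ticket."""
    encoded = base64.b64encode(secret.encode("utf-8")).decode("ascii")
    return json.dumps(
        {"credentials": [{"principal": principal, "secret": encoded}]},
        indent=2,
    )
```

Calling credentials_json("user", "password") yields a document whose secret field is the base64 form of "password", as in the last example above.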





[jira] [Updated] (MESOS-3599) COMMAND health checks with marathon running in slave context broken

2015-10-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-3599:
--
Fix Version/s: (was: 0.23.1)
   (was: 0.24.1)
   (was: 0.25.0)

> COMMAND health checks with marathon running in slave context broken
> ---
>
> Key: MESOS-3599
> URL: https://issues.apache.org/jira/browse/MESOS-3599
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Erhan Kesken
>Assignee: haosdent
>Priority: Critical
>
> When deploying Mesos 0.23rc4 with the latest Marathon 0.10.0 RC3, command health 
> checks stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Created] (MESOS-3600) unable to build with non-default protobuf

2015-10-07 Thread James Peach (JIRA)
James Peach created MESOS-3600:
--

 Summary: unable to build with non-default protobuf
 Key: MESOS-3600
 URL: https://issues.apache.org/jira/browse/MESOS-3600
 Project: Mesos
  Issue Type: Bug
  Components: build
Reporter: James Peach


If I install a custom protobuf into {{/opt/protobuf}}, I should be able to pass 
{{--with-protobuf=/opt/protobuf}} to configure the build to use it.

On OS X, this fails:
{code}
...
checking google/protobuf/message.h usability... yes
checking google/protobuf/message.h presence... yes
checking for google/protobuf/message.h... yes
checking for _init in -lprotobuf... no
configure: error: cannot find protobuf
---
You have requested the use of a non-bundled protobuf but no suitable
protobuf could be found.

You may want specify the location of protobuf by providing a prefix
path via --with-protobuf=DIR, or check that the path you provided is
correct if youre already doing this.
---
{code}






[jira] [Updated] (MESOS-3268) apply-review.sh crashes with non ASCII char

2015-10-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-3268:
--
Shepherd: Michael Park
  Labels: mesosphere  (was: )

> apply-review.sh crashes with non ASCII char
> ---
>
> Key: MESOS-3268
> URL: https://issues.apache.org/jira/browse/MESOS-3268
> Project: Mesos
>  Issue Type: Bug
>  Components: reviewbot
>Reporter: José Guilherme Vanz
>Assignee: Gastón Kleiman
>Priority: Minor
>  Labels: mesosphere
>
> There is an issue in the apply-review script when the user name field contains a 
> non-ASCII character. E.g.:
> Bad patch!
> Reviews applied: [37468]
> Failed command: ./support/apply-review.sh -n -r 37468
> Error:
>  2015-08-14 04:22:30 URL:https://reviews.apache.org/r/37468/diff/raw/ 
> [23334/23334] -> "37468.patch" [1]
> Traceback (most recent call last):
>   File "./support/jsonurl.py", line 25, in <module>
> print data
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 
> 3: ordinal not in range(128)
> Successfully applied: Removed allocation types to mesos::master namespace
> The allocation-related types was moved to mesos::master namespace.
> MESOS-2516
> Review: https://reviews.apache.org/r/37468
> fatal: empty ident name (for ) not allowed
> Failed to commit patch
> In this example, the problem was caused by the user's full name containing 
> the "é" character. To reproduce the problem, you can run the following shell command:
> `AUTHOR_NAME=$(./support/jsonurl.py 
> https://reviews.apache.org/api/users/jvanz/ user fullname)`
> (This is my user; I have since removed the non-ASCII character so that I can send more patches.)
> The problem occurs when the output of the Python script is stored in a variable. 
> If you call the Python script without storing the result in a variable, 
> everything works fine.
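
The failure described in the issue can be sketched in Python 3. The original script is Python 2, where printing a unicode string falls back to the ASCII codec when stdout is not a terminal (as in a $(...) capture). This is only an illustration of the mechanism, not the patch under review; the sample name and the helper encode_for_pipe are assumptions made for the example.

```python
# Sketch of the encoding failure, written for Python 3.
# The sample name and the helper below are illustrative, not taken from jsonurl.py.
name = "Jos\u00e9"  # "José": the non-ASCII character sits at index 3


def encode_for_pipe(text):
    """Encode explicitly as UTF-8 instead of relying on a default ASCII codec."""
    return text.encode("utf-8")


try:
    name.encode("ascii")  # what an implicit ASCII encode amounts to
    failed = False
except UnicodeEncodeError as error:
    failed = True
    position = error.start  # index of the character that could not be encoded

encoded = encode_for_pipe(name)
```

Encoding explicitly before writing makes the result independent of whether the output goes to a terminal or is captured in a variable.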





[jira] [Updated] (MESOS-3268) apply-review.sh crashes with non ASCII char

2015-10-07 Thread Adam B (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Adam B updated MESOS-3268:
--
Assignee: Gastón Kleiman

> apply-review.sh crashes with non ASCII char
> ---
>
> Key: MESOS-3268
> URL: https://issues.apache.org/jira/browse/MESOS-3268
> Project: Mesos
>  Issue Type: Bug
>  Components: reviewbot
>Reporter: José Guilherme Vanz
>Assignee: Gastón Kleiman
>Priority: Minor
>
> There is an issue in the apply-review script when the user name field contains a 
> non-ASCII character. E.g.:
> Bad patch!
> Reviews applied: [37468]
> Failed command: ./support/apply-review.sh -n -r 37468
> Error:
>  2015-08-14 04:22:30 URL:https://reviews.apache.org/r/37468/diff/raw/ 
> [23334/23334] -> "37468.patch" [1]
> Traceback (most recent call last):
>   File "./support/jsonurl.py", line 25, in <module>
> print data
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 
> 3: ordinal not in range(128)
> Successfully applied: Removed allocation types to mesos::master namespace
> The allocation-related types was moved to mesos::master namespace.
> MESOS-2516
> Review: https://reviews.apache.org/r/37468
> fatal: empty ident name (for ) not allowed
> Failed to commit patch
> In this example, the problem was caused by the user's full name containing 
> the "é" character. To reproduce the problem, you can run the following shell command:
> `AUTHOR_NAME=$(./support/jsonurl.py 
> https://reviews.apache.org/api/users/jvanz/ user fullname)`
> (This is my user; I have since removed the non-ASCII character so that I can send more patches.)
> The problem occurs when the output of the Python script is stored in a variable. 
> If you call the Python script without storing the result in a variable, 
> everything works fine.





[jira] [Commented] (MESOS-3268) apply-review.sh crashes with non ASCII char

2015-10-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947217#comment-14947217
 ] 

Adam B commented on MESOS-3268:
---

https://reviews.apache.org/r/39087/

> apply-review.sh crashes with non ASCII char
> ---
>
> Key: MESOS-3268
> URL: https://issues.apache.org/jira/browse/MESOS-3268
> Project: Mesos
>  Issue Type: Bug
>  Components: reviewbot
>Reporter: José Guilherme Vanz
>Priority: Minor
>
> There is an issue in the apply-review script when the user name field contains a 
> non-ASCII character. E.g.:
> Bad patch!
> Reviews applied: [37468]
> Failed command: ./support/apply-review.sh -n -r 37468
> Error:
>  2015-08-14 04:22:30 URL:https://reviews.apache.org/r/37468/diff/raw/ 
> [23334/23334] -> "37468.patch" [1]
> Traceback (most recent call last):
>   File "./support/jsonurl.py", line 25, in <module>
> print data
> UnicodeEncodeError: 'ascii' codec can't encode character u'\xe9' in position 
> 3: ordinal not in range(128)
> Successfully applied: Removed allocation types to mesos::master namespace
> The allocation-related types was moved to mesos::master namespace.
> MESOS-2516
> Review: https://reviews.apache.org/r/37468
> fatal: empty ident name (for ) not allowed
> Failed to commit patch
> In this example, the problem was caused by the user's full name containing 
> the "é" character. To reproduce the problem, you can run the following shell command:
> `AUTHOR_NAME=$(./support/jsonurl.py 
> https://reviews.apache.org/api/users/jvanz/ user fullname)`
> (This is my user; I have since removed the non-ASCII character so that I can send more patches.)
> The problem occurs when the output of the Python script is stored in a variable. 
> If you call the Python script without storing the result in a variable, 
> everything works fine.





[jira] [Commented] (MESOS-3599) COMMAND health checks with marathon running in slave context broken

2015-10-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947170#comment-14947170
 ] 

haosdent commented on MESOS-3599:
-

My concern here is that if we change the Docker health check behaviour again and 
again, users will get confused...

> COMMAND health checks with marathon running in slave context broken
> ---
>
> Key: MESOS-3599
> URL: https://issues.apache.org/jira/browse/MESOS-3599
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Erhan Kesken
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
>
> When deploying Mesos 0.23rc4 with the latest Marathon 0.10.0 RC3, command health 
> checks stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3599) COMMAND health checks with marathon running in slave context broken

2015-10-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947167#comment-14947167
 ] 

haosdent commented on MESOS-3599:
-

[~ekesken] I totally agree that your cases are normal use cases for health 
checks. In fact, we have two ways to change the health check in the Docker executor:

* First way:
Add an option to the health check protobuf and use it as a flag to 
determine whether the health check should be launched outside or inside the 
container. But changing the health check protobuf would make it more confusing. 
Suppose we add HTTP and TCP checks in the future: should this field also apply 
to them or not?

* Second way:
Drop the current implementation and revert to the old behaviour; if users want to 
run the health check inside the container, they use "docker exec" instead. To make 
it more convenient for users to operate on the Docker container, we could introduce 
an environment variable like "MESOS_CONTAINER_NAME", so that you do not need a 
tricky way to get the Docker container name.
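
A minimal sketch of how a health check could use the second way, assuming the proposed variable existed. MESOS_CONTAINER_NAME is only the name suggested in this comment, not something Mesos is known to export, and docker_exec_command is a hypothetical helper. The sketch also assumes the container image provides sh. It only builds the argument list and does not invoke Docker.

```python
# Sketch only: MESOS_CONTAINER_NAME is the variable proposed in the comment,
# not an existing Mesos feature; the function name is hypothetical.
import os


def docker_exec_command(check_command, environ=None):
    """Build a `docker exec` argument list that runs a check inside the task's container."""
    environ = os.environ if environ is None else environ
    container = environ.get("MESOS_CONTAINER_NAME")
    if not container:
        raise RuntimeError("MESOS_CONTAINER_NAME is not set")
    # `docker exec CONTAINER COMMAND [ARG...]`; `sh -c` lets the check use pipes.
    return ["docker", "exec", container, "sh", "-c", check_command]


command = docker_exec_command(
    "curl -f http://localhost:5000/v2/",
    environ={"MESOS_CONTAINER_NAME": "mesos-example"},
)
```

With such a variable, the lookup loop over "docker ps" and "docker inspect" would no longer be needed to find the container.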

> COMMAND health checks with marathon running in slave context broken
> ---
>
> Key: MESOS-3599
> URL: https://issues.apache.org/jira/browse/MESOS-3599
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Erhan Kesken
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
>
> When deploying Mesos 0.23rc4 with the latest Marathon 0.10.0 RC3, command health 
> checks stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3560) JSON-based credential files do not work correctly

2015-10-07 Thread Marco Massenzio (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947148#comment-14947148
 ] 

Marco Massenzio commented on MESOS-3560:


Sure - not a problem.
Do we need to take extra steps to ensure we don't break folks during upgrades?

> JSON-based credential files do not work correctly
> -
>
> Key: MESOS-3560
> URL: https://issues.apache.org/jira/browse/MESOS-3560
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Michael Park
>Assignee: Isabel Jimenez
>  Labels: mesosphere
> Fix For: 0.26.0
>
>
> Specifying the following credentials file:
> {code}
> {
>   "credentials": [
>     {
>       "principal": "user",
>       "secret": "password"
>     }
>   ]
> }
> {code}
> Then hitting a master endpoint with:
> {code}
> curl -i -u "user:password" ...
> {code}
> Does not work. This is contrary to the text-based credentials file which 
> works:
> {code}
> user password
> {code}
> Currently, the password in a JSON-based credentials file needs to be 
> base64-encoded in order for it to work:
> {code}
> {
>   "credentials": [
>     {
>       "principal": "user",
>       "secret": "cGFzc3dvcmQ="
>     }
>   ]
> }
> {code}





[jira] [Updated] (MESOS-3560) JSON-based credential files do not work correctly

2015-10-07 Thread Marco Massenzio (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Marco Massenzio updated MESOS-3560:
---
Fix Version/s: (was: 0.24.0)
   (was: 0.23.0)
   (was: 0.22.0)
   (was: 0.21.0)
   (was: 0.20.0)
   0.26.0

> JSON-based credential files do not work correctly
> -
>
> Key: MESOS-3560
> URL: https://issues.apache.org/jira/browse/MESOS-3560
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Michael Park
>Assignee: Isabel Jimenez
>  Labels: mesosphere
> Fix For: 0.26.0
>
>
> Specifying the following credentials file:
> {code}
> {
>   "credentials": [
>     {
>       "principal": "user",
>       "secret": "password"
>     }
>   ]
> }
> {code}
> Then hitting a master endpoint with:
> {code}
> curl -i -u "user:password" ...
> {code}
> Does not work. This is contrary to the text-based credentials file which 
> works:
> {code}
> user password
> {code}
> Currently, the password in a JSON-based credentials file needs to be 
> base64-encoded in order for it to work:
> {code}
> {
>   "credentials": [
>     {
>       "principal": "user",
>       "secret": "cGFzc3dvcmQ="
>     }
>   ]
> }
> {code}





[jira] [Commented] (MESOS-3215) CgroupsAnyHierarchyWithPerfEventTest failing on Ubuntu 14.04

2015-10-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947139#comment-14947139
 ] 

haosdent commented on MESOS-3215:
-

This only happens on CentOS 7.1. I am using a physical machine.


> CgroupsAnyHierarchyWithPerfEventTest failing on Ubuntu 14.04
> 
>
> Key: MESOS-3215
> URL: https://issues.apache.org/jira/browse/MESOS-3215
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>  Labels: mesosphere
>
> [ RUN  ] CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> ../../src/tests/containerizer/cgroups_tests.cpp:172: Failure
> (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup 
> '/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy
> ../../src/tests/containerizer/cgroups_tests.cpp:190: Failure
> (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup 
> '/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy
> [  FAILED  ] CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf (9 ms)
> [--] 1 test from CgroupsAnyHierarchyWithPerfEventTest (9 ms total)





[jira] [Comment Edited] (MESOS-3215) CgroupsAnyHierarchyWithPerfEventTest failing on Ubuntu 14.04

2015-10-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3215?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14947139#comment-14947139
 ] 

haosdent edited comment on MESOS-3215 at 10/7/15 4:22 PM:
--

This also happens on CentOS 7.1. I am using a physical machine.



was (Author: haosd...@gmail.com):
This only happens on CentOS 7.1. I am using a physical machine.


> CgroupsAnyHierarchyWithPerfEventTest failing on Ubuntu 14.04
> 
>
> Key: MESOS-3215
> URL: https://issues.apache.org/jira/browse/MESOS-3215
> Project: Mesos
>  Issue Type: Bug
>Reporter: Artem Harutyunyan
>  Labels: mesosphere
>
> [ RUN  ] CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf
> ../../src/tests/containerizer/cgroups_tests.cpp:172: Failure
> (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup 
> '/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy
> ../../src/tests/containerizer/cgroups_tests.cpp:190: Failure
> (cgroups::destroy(hierarchy, cgroup)).failure(): Failed to remove cgroup 
> '/sys/fs/cgroup/perf_event/mesos_test': Device or resource busy
> [  FAILED  ] CgroupsAnyHierarchyWithPerfEventTest.ROOT_CGROUPS_Perf (9 ms)
> [--] 1 test from CgroupsAnyHierarchyWithPerfEventTest (9 ms total)





[jira] [Commented] (MESOS-3598) move documentation to Sphinx

2015-10-07 Thread Ryuichi Okumura (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3598?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946994#comment-14946994
 ] 

Ryuichi Okumura commented on MESOS-3598:


Support for translating the documentation would be great. I'd especially like 
to contribute Japanese translations.

> move documentation to Sphinx
> 
>
> Key: MESOS-3598
> URL: https://issues.apache.org/jira/browse/MESOS-3598
> Project: Mesos
>  Issue Type: Improvement
>  Components: documentation
>Reporter: James Peach
>Priority: Minor
>
> Sphinx has better cross-referencing capabilities than the current Markdown 
> processor, plus the ability to render man pages.





[jira] [Comment Edited] (MESOS-3599) COMMAND health checks with marathon running in slave context broken

2015-10-07 Thread Erhan Kesken (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946971#comment-14946971
 ] 

Erhan Kesken edited comment on MESOS-3599 at 10/7/15 2:52 PM:
--

I had to roll back to 0.22.2 because of this issue. I'm sharing some of the 
health check commands we currently use to give you an idea of our use cases:

{noformat}
# for mesos-dns:
dig +short leader.mesos @$HOST | grep .
# for docker-registry:
curl -f http://$HOST:5000/v2/ | grep '{}'
# for my python processes
docker logs $(for i in $(docker ps -q --no-trunc); do docker inspect $i | grep 
-sq MESOS_TASK_ID=${MESOS_TASK_ID:?} && echo $i; done) | grep 'Welcome to 
interaction-indexer.'
# for rabbitmq
curl -f http://XX:XX@$HOST:15672/api/vhosts/ | grep 'ZZZ'
{noformat}



was (Author: ekesken):
I had to roll back to 0.22.2 because of this issue. I'm sharing some of the 
health check commands we currently use to give you an idea of our use cases:

{no-format}
# for mesos-dns:
dig +short leader.mesos @$HOST | grep .
# for docker-registry:
curl -f http://$HOST:5000/v2/ | grep '{}'
# for my python processes
docker logs $(for i in $(docker ps -q --no-trunc); do docker inspect $i | grep 
-sq MESOS_TASK_ID=${MESOS_TASK_ID:?} && echo $i; done) | grep 'Welcome to 
interaction-indexer.'
# for rabbitmq
curl -f http://XX:XX@$HOST:15672/api/vhosts/ | grep 'ZZZ'
{no-format}


> COMMAND health checks with marathon running in slave context broken
> ---
>
> Key: MESOS-3599
> URL: https://issues.apache.org/jira/browse/MESOS-3599
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Erhan Kesken
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
>
> When deploying Mesos 0.23rc4 with the latest Marathon 0.10.0 RC3, command health 
> checks stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3599) COMMAND health checks with marathon running in slave context broken

2015-10-07 Thread Erhan Kesken (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946971#comment-14946971
 ] 

Erhan Kesken commented on MESOS-3599:
-

I had to roll back to 0.22.2 because of this issue. I'm sharing some of the 
health check commands we currently use to give you an idea of our use cases:

{no-format}
# for mesos-dns:
dig +short leader.mesos @$HOST | grep .
# for docker-registry:
curl -f http://$HOST:5000/v2/ | grep '{}'
# for my python processes
docker logs $(for i in $(docker ps -q --no-trunc); do docker inspect $i | grep 
-sq MESOS_TASK_ID=${MESOS_TASK_ID:?} && echo $i; done) | grep 'Welcome to 
interaction-indexer.'
# for rabbitmq
curl -f http://XX:XX@$HOST:15672/api/vhosts/ | grep 'ZZZ'
{no-format}


> COMMAND health checks with marathon running in slave context broken
> ---
>
> Key: MESOS-3599
> URL: https://issues.apache.org/jira/browse/MESOS-3599
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Erhan Kesken
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
>
> When deploying Mesos 0.23rc4 with the latest Marathon 0.10.0 RC3, command health 
> checks stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Assigned] (MESOS-3559) Make the Command Scheduler use the HTTP Scheduler Library

2015-10-07 Thread Guangya Liu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3559?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guangya Liu reassigned MESOS-3559:
--

Assignee: Guangya Liu

> Make the Command Scheduler use the HTTP Scheduler Library
> -
>
> Key: MESOS-3559
> URL: https://issues.apache.org/jira/browse/MESOS-3559
> Project: Mesos
>  Issue Type: Task
>Reporter: Anand Mazumdar
>Assignee: Guangya Liu
>  Labels: mesosphere
>
> We should make the Command Scheduler in {{src/cli/executor.cpp}} use the 
> Scheduler Library {{src/scheduler/scheduler.cpp}} instead of the Scheduler 
> Driver.





[jira] [Updated] (MESOS-3517) Building mesos from source fails when OS language is not English

2015-10-07 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier updated MESOS-3517:

Shepherd: Till Toenshoff

> Building mesos from source fails when OS language is not English
> 
>
> Key: MESOS-3517
> URL: https://issues.apache.org/jira/browse/MESOS-3517
> Project: Mesos
>  Issue Type: Bug
>  Components: build
> Environment: Dutch locale on Ubuntu 15.04
>Reporter: Wessel Nieboer
>Assignee: Benjamin Bannier
>Priority: Minor
>
> Line 963 of mesos/3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 
> contains the following:
>   EXPECT_TRUE(strings::contains(result.get(), "No such file or directory"));
> But this does not match when your locale is not English. When I change it to 
> what my terminal gives me, "Bestand of map bestaat niet", it works just 
> fine.
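
One locale-independent way to write such an expectation is to compare against the message the C library itself produces for ENOENT instead of a hard-coded English string. The sketch below shows the idea in Python rather than in the C++ test, assuming a POSIX system; the path and the helper name missing_file_error are made up for the example, and this is not necessarily the fix Mesos adopted.

```python
import errno
import os


def missing_file_error(path):
    """Return the OSError raised when opening a path that does not exist."""
    try:
        open(path).close()
    except OSError as error:
        return error
    raise AssertionError("expected %r to be missing" % path)


error = missing_file_error("/nonexistent/path/for/illustration")

# Ask the C library for the ENOENT message in the current locale
# rather than assuming it reads "No such file or directory".
expected_message = os.strerror(errno.ENOENT)
```

Because both strings come from the same source, the comparison holds whether the message is in English or Dutch.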





[jira] [Assigned] (MESOS-3517) Building mesos from source fails when OS language is not English

2015-10-07 Thread Benjamin Bannier (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benjamin Bannier reassigned MESOS-3517:
---

Assignee: Benjamin Bannier

> Building mesos from source fails when OS language is not English
> 
>
> Key: MESOS-3517
> URL: https://issues.apache.org/jira/browse/MESOS-3517
> Project: Mesos
>  Issue Type: Bug
>  Components: build
> Environment: Dutch locale on Ubuntu 15.04
>Reporter: Wessel Nieboer
>Assignee: Benjamin Bannier
>Priority: Minor
>
> Line 963 of mesos/3rdparty/libprocess/3rdparty/stout/tests/os_tests.cpp 
> contains the following:
>   EXPECT_TRUE(strings::contains(result.get(), "No such file or directory"));
> But this does not match when your locale is not English. When I change it to 
> what my terminal gives me, "Bestand of map bestaat niet", it works just 
> fine.





[jira] [Commented] (MESOS-3599) COMMAND health checks with marathon running in slave context broken

2015-10-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946760#comment-14946760
 ] 

haosdent commented on MESOS-3599:
-

[~tnachen] Do you think Mesos should add a Docker health check that can run 
outside the container?

> COMMAND health checks with marathon running in slave context broken
> ---
>
> Key: MESOS-3599
> URL: https://issues.apache.org/jira/browse/MESOS-3599
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Erhan Kesken
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
>
> When deploying Mesos 0.23rc4 with the latest Marathon 0.10.0 RC3, command health 
> checks stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3560) JSON-based credential files do not work correctly

2015-10-07 Thread Michael Park (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946756#comment-14946756
 ] 

Michael Park commented on MESOS-3560:
-

From what I can tell, what [~adam-mesos] mentioned in his presentation was 
explaining why the {{secret}} field is {{optional}}, not why the type of 
{{secret}} is {{bytes}}. That is, {{Kerberos}} for example passes its tickets 
out-of-band, so it doesn't use the {{secret}} part of {{Credential}}, just 
{{principal}}.

I would much prefer to actually solve this problem now rather than just 
documenting the behavior. Since the current behavior is inconsistent and 
unintended, we'll end up breaking more people later when we go to fix it.
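
For reference, the current workaround from the issue description (base64-encoding the secret in the JSON file, presumably because a bytes field is expected to be base64 in its JSON form) can be sketched as follows. The helper name credentials_json is hypothetical, and this illustrates today's behaviour, not the proposed fix.

```python
import base64
import json


def credentials_json(principal, secret):
    """Render a credentials document with the secret base64-encoded."""
    encoded = base64.b64encode(secret.encode("utf-8")).decode("ascii")
    return json.dumps(
        {"credentials": [{"principal": principal, "secret": encoded}]},
        indent=2,
    )


# Round-trip the rendered text to inspect the stored secret.
document = json.loads(credentials_json("user", "password"))
```

For the principal "user" and secret "password" this yields the same "cGFzc3dvcmQ=" value shown in the issue description.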

> JSON-based credential files do not work correctly
> -
>
> Key: MESOS-3560
> URL: https://issues.apache.org/jira/browse/MESOS-3560
> Project: Mesos
>  Issue Type: Bug
>  Components: master
>Reporter: Michael Park
>Assignee: Isabel Jimenez
>  Labels: mesosphere
> Fix For: 0.20.0, 0.21.0, 0.22.0, 0.23.0, 0.24.0
>
>
> Specifying the following credentials file:
> {code}
> {
>   "credentials": [
>     {
>       "principal": "user",
>       "secret": "password"
>     }
>   ]
> }
> {code}
> Then hitting a master endpoint with:
> {code}
> curl -i -u "user:password" ...
> {code}
> Does not work. This is contrary to the text-based credentials file which 
> works:
> {code}
> user password
> {code}
> Currently, the password in a JSON-based credentials file needs to be 
> base64-encoded in order for it to work:
> {code}
> {
>   "credentials": [
>     {
>       "principal": "user",
>       "secret": "cGFzc3dvcmQ="
>     }
>   ]
> }
> {code}





[jira] [Updated] (MESOS-3599) COMMAND health checks with marathon running in slave context broken

2015-10-07 Thread haosdent (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

haosdent updated MESOS-3599:

Summary: COMMAND health checks with marathon running in slave context 
broken  (was: CLONE - COMMAND health checks with marathon running in slave 
context broken)

> COMMAND health checks with marathon running in slave context broken
> ---
>
> Key: MESOS-3599
> URL: https://issues.apache.org/jira/browse/MESOS-3599
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Erhan Kesken
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
>
> When deploying Mesos 0.23rc4 with the latest Marathon 0.10.0 RC3, command health 
> checks stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3551) Replace use of strerror with thread-safe alternatives strerror_r / strerror_l.

2015-10-07 Thread Benjamin Bannier (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3551?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946733#comment-14946733
 ] 

Benjamin Bannier commented on MESOS-3551:
-

OK, the 0th implementation brought some open ends to the surface.

h5. strerror_r vs. strerror_l

To avoid breaking OS X compatibility, {{strerror_l}} seems out of reach and we 
need to resort to {{strerror_r}}.

h5. Glibc provides a non-compliant and potentially broken {{strerror_r}}

Since {{strerror}} is currently used all across stout (a few uses), libprocess (a 
handful), and mesos (plenty), a natural place to implement a wrapper would 
probably be stout.

Since stout is header-only, a reusable wrapper implementation would probably 
have to use whichever implementation is available under the covers (or: _Should 
this be a reason to (a) implement the wrapper higher up, e.g. in libprocess, or 
(b) make stout include compiled components, or (c) no, leave it in stout?_).


Assuming we decide to implement this wrapper in a header, we would also need to 
decide how to deal with a bug in glibc-2.15:

bq. https://sourceware.org/bugzilla/show_bug.cgi?id=12204

Here glibc's {{strerror_r}} might set the global {{errno}} should it run into 
errors itself (e.g. because the passed errnum was invalid), which is not 
compliant and probably unexpected. Fixed versions were shipped e.g. starting 
with Debian8, CentOS7, Ubuntu14.04.

Since the mesos {{configure.ac}} already requires at least gcc-4.8.0, which is 
not satisfied by stock Debian7 (gcc-4.7.2), CentOS6 (gcc-4.4.7), or Ubuntu12.04 
(gcc-4.6.3), _we could either (a) provide an implementation for glibc-2.16+ 
only, or (b) introduce a workaround if we are using an old version_. If 
we decide on (a), it appears that adding a configure check wouldn't be 
sufficient to prevent someone from including the header from stout, so we would 
need to add checks and potentially {{#error}} directives in the implementation itself.


h5. Localized error messages

We cannot know the maximal length of error messages since they might be 
localized. We could either

  (a) implement an algorithm growing the buffer used by {{strerror_r}} until 
the message fits in, or
  (b) use a fixed buffer size with an educated guess about the maximal error 
message length (say 2000 chars, as used in {{llvm::sys::StrError}}).

Given the complexity that workarounds for glibc non-conformance might 
introduce, I feel option (b) might be good enough for now.


Any input welcome.


> Replace use of strerror with thread-safe alternatives strerror_r / strerror_l.
> --
>
> Key: MESOS-3551
> URL: https://issues.apache.org/jira/browse/MESOS-3551
> Project: Mesos
>  Issue Type: Bug
>  Components: libprocess, stout
>Reporter: Benjamin Mahler
>Assignee: Benjamin Bannier
>  Labels: newbie, tech-debt
>
> {{strerror()}} is not required to be thread safe by POSIX and is listed as 
> unsafe on Linux:
> http://pubs.opengroup.org/onlinepubs/9699919799/
> http://man7.org/linux/man-pages/man3/strerror.3.html
> I don't believe we've seen any issues reported due to this. We should replace 
> occurrences of strerror accordingly, possibly offering a wrapper in stout to 
> simplify callsites.





[jira] [Commented] (MESOS-3599) CLONE - COMMAND health checks with marathon running in slave context broken

2015-10-07 Thread Erhan Kesken (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946596#comment-14946596
 ] 

Erhan Kesken commented on MESOS-3599:
-

For details about the problem, see the following comment on MESOS-3136:

https://issues.apache.org/jira/browse/MESOS-3136?focusedCommentId=14946378


> CLONE - COMMAND health checks with marathon running in slave context broken
> ---
>
> Key: MESOS-3599
> URL: https://issues.apache.org/jira/browse/MESOS-3599
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Erhan Kesken
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
>
> When deploying Mesos 0.23rc4 with the latest Marathon 0.10.0 RC3, command health 
> checks stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3136) COMMAND health checks with Marathon 0.10.0 are broken

2015-10-07 Thread Erhan Kesken (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946594#comment-14946594
 ] 

Erhan Kesken commented on MESOS-3136:
-

I cloned this issue as MESOS-3599, as Adam suggested. Thank you.

> COMMAND health checks with Marathon 0.10.0 are broken
> -
>
> Key: MESOS-3136
> URL: https://issues.apache.org/jira/browse/MESOS-3136
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Dr. Stefan Schimanski
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
> Attachments: MESOS-3136_0_23_0.patch, MESOS-3136_0_24_0.patch
>
>
> When deploying Mesos 0.23rc4 with the latest Marathon 0.10.0 RC3, command health 
> checks stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-3599) CLONE - COMMAND health checks with marathon running in slave context broken

2015-10-07 Thread Erhan Kesken (JIRA)
Erhan Kesken created MESOS-3599:
---

 Summary: CLONE - COMMAND health checks with marathon running in 
slave context broken
 Key: MESOS-3599
 URL: https://issues.apache.org/jira/browse/MESOS-3599
 Project: Mesos
  Issue Type: Bug
Affects Versions: 0.23.0
Reporter: Erhan Kesken
Assignee: haosdent
Priority: Critical
 Fix For: 0.23.1, 0.24.1, 0.25.0


When deploying Mesos 0.23rc4 with the latest Marathon 0.10.0 RC3, COMMAND health 
checks stop working. Rolling back to Mesos 0.22.1 fixes the problem.

Containerizer is Docker.
All packages are from official Mesosphere Ubuntu 14.04 sources.

The issue must be analyzed further.





[jira] [Commented] (MESOS-3136) COMMAND health checks with Marathon 0.10.0 are broken

2015-10-07 Thread Erhan Kesken (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946570#comment-14946570
 ] 

Erhan Kesken commented on MESOS-3136:
-

Sorry for my lack of attention. I had actually used the parameter as 
--launcher_dir, as it is supposed to be; I just wrote the wrong name in my 
comment here.

> COMMAND health checks with Marathon 0.10.0 are broken
> -
>
> Key: MESOS-3136
> URL: https://issues.apache.org/jira/browse/MESOS-3136
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Dr. Stefan Schimanski
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
> Attachments: MESOS-3136_0_23_0.patch, MESOS-3136_0_24_0.patch
>
>
> When deploying Mesos 0.23rc4 with latest Marathon 0.10.0 RC3 command health 
> check stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Comment Edited] (MESOS-3136) COMMAND health checks with Marathon 0.10.0 are broken

2015-10-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946560#comment-14946560
 ] 

haosdent edited comment on MESOS-3136 at 10/7/15 9:17 AM:
--

As I remember, it should be {noformat}--launcher_dir={noformat}, not 
{noformat}--launch-dir{noformat}. You are missing {noformat}er{noformat} and 
using {noformat}-{noformat} where an underscore is expected. If you think 
running the health check outside the docker container is necessary, please 
feel free to open a new ticket as Adam said. If that issue is accepted, I will 
submit a patch for it as soon as possible.


was (Author: haosd...@gmail.com):
As I remember, it should be "--launcher_dir=", not "--launch-dir". You are 
missing "er" and using "-" where an underscore is expected.

> COMMAND health checks with Marathon 0.10.0 are broken
> -
>
> Key: MESOS-3136
> URL: https://issues.apache.org/jira/browse/MESOS-3136
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Dr. Stefan Schimanski
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
> Attachments: MESOS-3136_0_23_0.patch, MESOS-3136_0_24_0.patch
>
>
> When deploying Mesos 0.23rc4 with latest Marathon 0.10.0 RC3 command health 
> check stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3136) COMMAND health checks with Marathon 0.10.0 are broken

2015-10-07 Thread Erhan Kesken (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946563#comment-14946563
 ] 

Erhan Kesken commented on MESOS-3136:
-

As you stated, I think both options are required. Back when health checks ran 
in the mesos-slave context, we at least had a workaround that kept both options 
available, with an ugly line like this:

{noformat}
docker exec $(for i in $(docker ps -q --no-trunc); do docker inspect $i | grep -sq MESOS_TASK_ID=${MESOS_TASK_ID:?} && echo $i; done) ls /
{noformat}

But now, in the docker context, there is no workaround for getting the other 
option, so I see this issue as a MUST.
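
A more readable variant of the one-liner above could look like the following 
sketch. The function name find_task_container is hypothetical, and the sketch 
assumes the docker CLI is on PATH and that the task's container has 
MESOS_TASK_ID in its environment. Unlike the grep in the one-liner, it matches 
the whole VAR=value line, so a task ID that is a prefix of another one does not 
produce a false hit.

```shell
#!/bin/sh
# Hypothetical helper: print the ID of every running container whose
# environment contains exactly MESOS_TASK_ID=<task id>.
find_task_container() {
    task_id=${1:?usage: find_task_container TASK_ID}
    for id in $(docker ps -q --no-trunc); do
        # Print the container's environment, one VAR=value entry per line,
        # and look for an exact, fixed-string match of the whole line.
        if docker inspect --format '{{range .Config.Env}}{{println .}}{{end}}' "$id" \
            | grep -qxF "MESOS_TASK_ID=$task_id"; then
            echo "$id"
        fi
    done
}
```

With that in place, the workaround would read 
docker exec "$(find_task_container "${MESOS_TASK_ID:?}")" ls /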

> COMMAND health checks with Marathon 0.10.0 are broken
> -
>
> Key: MESOS-3136
> URL: https://issues.apache.org/jira/browse/MESOS-3136
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Dr. Stefan Schimanski
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
> Attachments: MESOS-3136_0_23_0.patch, MESOS-3136_0_24_0.patch
>
>
> When deploying Mesos 0.23rc4 with latest Marathon 0.10.0 RC3 command health 
> check stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3136) COMMAND health checks with Marathon 0.10.0 are broken

2015-10-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946560#comment-14946560
 ] 

haosdent commented on MESOS-3136:
-

As I remember, it should be "--launcher_dir=", not "--launch-dir". You are 
missing "er" and using "-" where an underscore is expected.

> COMMAND health checks with Marathon 0.10.0 are broken
> -
>
> Key: MESOS-3136
> URL: https://issues.apache.org/jira/browse/MESOS-3136
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Dr. Stefan Schimanski
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
> Attachments: MESOS-3136_0_23_0.patch, MESOS-3136_0_24_0.patch
>
>
> When deploying Mesos 0.23rc4 with latest Marathon 0.10.0 RC3 command health 
> check stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3136) COMMAND health checks with Marathon 0.10.0 are broken

2015-10-07 Thread Erhan Kesken (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946545#comment-14946545
 ] 

Erhan Kesken commented on MESOS-3136:
-

Both the mesos-docker-executor and mesos-health-check binaries are under the 
/usr/libexec/mesos folder.

Before I passed --launch-dir to mesos-slave, mesos-docker-executor was failing. 
The parameter fixed that problem, but at the next step the health check failed.

When I checked the code at 
https://github.com/apache/mesos/blob/5058fac1083dc91bca54d33c26c810c17ad95dd1/src/docker/executor.cpp#L573,
 I concluded that the --launch-dir parameter does not set the 
MESOS_LAUNCHER_DIR environment variable.

I removed the --launch-dir parameter and set the MESOS_LAUNCHER_DIR environment 
variable for the mesos-slave process by editing the /etc/default/mesos-slave 
file. I confirmed that the new environment setting was active by checking the 
/proc//environ file. mesos-docker-executor did not fail, just as when I passed 
the --launch-dir parameter, but mesos-health-check failed again.

I checked the mesos-docker-executor process environment from proc, but 
MESOS_LAUNCHER_DIR was not there. Finally, I solved the problem by adding the 
environment variable to my Marathon config, after understanding that health 
checks run inside docker.
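
The "check the process environment from proc" step can be sketched as below. 
The helper name env_from_environ is hypothetical; it only assumes that the file 
it is given has the Linux environ layout (NUL-separated VAR=value entries).

```shell
#!/bin/sh
# Hypothetical helper: print the value of one variable from an environ-style
# file, i.e. NUL-separated VAR=value entries as found under /proc on Linux.
# Prints nothing if the variable is not set for that process.
env_from_environ() {
    file=${1:?usage: env_from_environ FILE NAME}
    name=${2:?usage: env_from_environ FILE NAME}
    # Turn the NUL separators into newlines, then strip the "NAME=" prefix
    # from the matching entry and print what remains.
    tr '\0' '\n' < "$file" | sed -n "s/^$name=//p"
}
```

Called with the environ file of the mesos-slave or mesos-docker-executor 
process and the name MESOS_LAUNCHER_DIR, it shows whether the variable actually 
reached that process.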



> COMMAND health checks with Marathon 0.10.0 are broken
> -
>
> Key: MESOS-3136
> URL: https://issues.apache.org/jira/browse/MESOS-3136
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Dr. Stefan Schimanski
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
> Attachments: MESOS-3136_0_23_0.patch, MESOS-3136_0_24_0.patch
>
>
> When deploying Mesos 0.23rc4 with latest Marathon 0.10.0 RC3 command health 
> check stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3136) COMMAND health checks with Marathon 0.10.0 are broken

2015-10-07 Thread Adam B (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946490#comment-14946490
 ] 

Adam B commented on MESOS-3136:
---

I can understand the desire to run the health check inside the docker 
container, so that it has access to all the environment/deps inside the 
container; or wanting to run it outside the container so you could query the 
docker engine itself. Perhaps running only outside the container is necessary, 
because the healthcheck command could specify to run `docker exec whatever` if 
it wants to run something inside the container.

If we decide that this is an issue to pursue, let's please open a new JIRA 
(clone this one, or otherwise link it), so we can track the new issue 
separately from the one we've already shipped a "fix" for. Thanks!

> COMMAND health checks with Marathon 0.10.0 are broken
> -
>
> Key: MESOS-3136
> URL: https://issues.apache.org/jira/browse/MESOS-3136
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Dr. Stefan Schimanski
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
> Attachments: MESOS-3136_0_23_0.patch, MESOS-3136_0_24_0.patch
>
>
> When deploying Mesos 0.23rc4 with latest Marathon 0.10.0 RC3 command health 
> check stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3136) COMMAND health checks with Marathon 0.10.0 are broken

2015-10-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946456#comment-14946456
 ] 

haosdent commented on MESOS-3136:
-

You need to set MESOS_LAUNCHER_DIR, or make sure mesos-docker-executor and 
mesos-health-check are in the same folder.
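
A quick way to verify the second condition is sketched below. The function 
name check_launcher_dir is hypothetical, and the fallback path 
/usr/libexec/mesos is only an assumption taken from the package layout 
mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: verify that both binaries are executable in one folder.
# The folder is the first argument, else $MESOS_LAUNCHER_DIR, else the
# assumed package default /usr/libexec/mesos.
check_launcher_dir() {
    dir=${1:-${MESOS_LAUNCHER_DIR:-/usr/libexec/mesos}}
    status=0
    for bin in mesos-docker-executor mesos-health-check; do
        if [ -x "$dir/$bin" ]; then
            echo "ok: $dir/$bin"
        else
            echo "missing: $dir/$bin" >&2
            status=1
        fi
    done
    # Non-zero if at least one of the two binaries was not found.
    return "$status"
}
```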

> COMMAND health checks with Marathon 0.10.0 are broken
> -
>
> Key: MESOS-3136
> URL: https://issues.apache.org/jira/browse/MESOS-3136
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Dr. Stefan Schimanski
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
> Attachments: MESOS-3136_0_23_0.patch, MESOS-3136_0_24_0.patch
>
>
> When deploying Mesos 0.23rc4 with latest Marathon 0.10.0 RC3 command health 
> check stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3136) COMMAND health checks with Marathon 0.10.0 are broken

2015-10-07 Thread haosdent (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946444#comment-14946444
 ] 

haosdent commented on MESOS-3136:
-

I'm a bit confused: should the health check run outside docker or inside 
docker? In our earlier discussion, we decided to put the health check inside 
the container, so we use "docker exec".

> COMMAND health checks with Marathon 0.10.0 are broken
> -
>
> Key: MESOS-3136
> URL: https://issues.apache.org/jira/browse/MESOS-3136
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Dr. Stefan Schimanski
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
> Attachments: MESOS-3136_0_23_0.patch, MESOS-3136_0_24_0.patch
>
>
> When deploying Mesos 0.23rc4 with latest Marathon 0.10.0 RC3 command health 
> check stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.





[jira] [Commented] (MESOS-3136) COMMAND health checks with Marathon 0.10.0 are broken

2015-10-07 Thread Erhan Kesken (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-3136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946393#comment-14946393
 ] 

Erhan Kesken commented on MESOS-3136:
-

I tested Mesos 0.24.1 with Marathon 0.11.0. I'm using the puppet module 
https://github.com/deric/puppet-mesos to install the mesos package. I needed to 
add the --launch-dir=/usr/libexec/mesos parameter to the mesos-slave process, 
but that was not enough: I also needed to put a 
"MESOS_LAUNCHER_DIR": "/usr/libexec/mesos" line into the env dict of my 
Marathon config. Otherwise the launchhealthcheck command cannot find the 
location of the mesos-health-check binary. Is there a more proper solution for 
this problem?

> COMMAND health checks with Marathon 0.10.0 are broken
> -
>
> Key: MESOS-3136
> URL: https://issues.apache.org/jira/browse/MESOS-3136
> Project: Mesos
>  Issue Type: Bug
>Affects Versions: 0.23.0
>Reporter: Dr. Stefan Schimanski
>Assignee: haosdent
>Priority: Critical
> Fix For: 0.23.1, 0.24.1, 0.25.0
>
> Attachments: MESOS-3136_0_23_0.patch, MESOS-3136_0_24_0.patch
>
>
> When deploying Mesos 0.23rc4 with latest Marathon 0.10.0 RC3 command health 
> check stop working. Rolling back to Mesos 0.22.1 fixes the problem.
> Containerizer is Docker.
> All packages are from official Mesosphere Ubuntu 14.04 sources.
> The issue must be analyzed further.


