[jira] [Updated] (MESOS-7617) UCR cannot read docker images containing long file paths

2017-06-02 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7617:
--
Affects Version/s: 1.3.0
   1.1.2
   1.2.0

> UCR cannot read docker images containing long file paths
> 
>
> Key: MESOS-7617
> URL: https://issues.apache.org/jira/browse/MESOS-7617
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.1.2, 1.2.0, 1.3.0, 1.3.1
>Reporter: Chun-Hung Hsiao
>  Labels: containerizer, triaged
>
> The latest Docker uses Go 1.7.5 
> (https://github.com/moby/moby/blob/master/CHANGELOG.md#contrib-1), in which 
> the {{archive/tar}} package has a bug: it mishandles file paths longer 
> than 100 characters (https://github.com/golang/go/issues/17630). As a result, 
> Docker generates images containing ill-formed tar files (details below) 
> when there are long paths. Docker itself understands the ill-formed image 
> fine, but a standard tar program will interpret the image as if all files 
> with long paths are placed under the root directory 
> (https://github.com/moby/moby/issues/29360).
> This bug has been fixed in Go 1.8, but since Docker still uses the buggy 
> version, we might need to handle these ill-formed images created by Docker 
> utilities.
> NOTE: It is confirmed that the {{archive/tar}} package in Go 1.8 cannot 
> correctly extract the ill-formed tar files, but the one in Go 1.7.5 can.
> Details: the {{archive/tar}} package uses the {{USTAR}} layout for files 
> with paths longer than 100 characters (putting only the file name in the 
> {{name}} field and the directory path in the {{prefix}} field of the tar 
> header), but writes {{OLDGNU}}'s magic string. Readers that see the 
> {{OLDGNU}} magic do not interpret the {{prefix}} field, so a standard tar 
> program extracts such files under the current directory.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)


[jira] [Updated] (MESOS-7617) UCR cannot read docker images containing long file paths

2017-06-02 Thread Jie Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jie Yu updated MESOS-7617:
--
Affects Version/s: 1.3.1

> UCR cannot read docker images containing long file paths
> 
>
> Key: MESOS-7617
> URL: https://issues.apache.org/jira/browse/MESOS-7617
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.1.2, 1.2.0, 1.3.0, 1.3.1
>Reporter: Chun-Hung Hsiao
>  Labels: containerizer, triaged
>
> The latest Docker uses Go 1.7.5 
> (https://github.com/moby/moby/blob/master/CHANGELOG.md#contrib-1), in which 
> the {{archive/tar}} package has a bug: it mishandles file paths longer 
> than 100 characters (https://github.com/golang/go/issues/17630). As a result, 
> Docker generates images containing ill-formed tar files (details below) 
> when there are long paths. Docker itself understands the ill-formed image 
> fine, but a standard tar program will interpret the image as if all files 
> with long paths are placed under the root directory 
> (https://github.com/moby/moby/issues/29360).
> This bug has been fixed in Go 1.8, but since Docker still uses the buggy 
> version, we might need to handle these ill-formed images created by Docker 
> utilities.
> NOTE: It is confirmed that the {{archive/tar}} package in Go 1.8 cannot 
> correctly extract the ill-formed tar files, but the one in Go 1.7.5 can.
> Details: the {{archive/tar}} package uses the {{USTAR}} layout for files 
> with paths longer than 100 characters (putting only the file name in the 
> {{name}} field and the directory path in the {{prefix}} field of the tar 
> header), but writes {{OLDGNU}}'s magic string. Readers that see the 
> {{OLDGNU}} magic do not interpret the {{prefix}} field, so a standard tar 
> program extracts such files under the current directory.





[jira] [Commented] (MESOS-7095) Basic make check from getting started link fails

2017-06-02 Thread Benjamin Mahler (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035652#comment-16035652
 ] 

Benjamin Mahler commented on MESOS-7095:


[~tillt] does the getting started guide need any updates related to this so 
that users don't hit it?
http://mesos.apache.org/gettingstarted/

> Basic make check from getting started link fails
> 
>
> Key: MESOS-7095
> URL: https://issues.apache.org/jira/browse/MESOS-7095
> Project: Mesos
>  Issue Type: Bug
>  Components: build
>Reporter: Alec Bruns
>
> *** Aborted at 1486657215 (unix time) try "date -d @1486657215" if you are using GNU date ***
> PC: @ 0x1080b7367 apr_pool_create_ex
> *** SIGSEGV (@0x30) received by PID 25167 (TID 0x7fffbdd073c0) stack trace: ***
> @ 0x7fffb50c7bba _sigtramp
> @ 0x72c0517 (unknown)
> @ 0x107eaa13a svn_pool_create_ex
> @ 0x107691d6e svn::diff()
> @ 0x107691042 SVNTest_DiffPatch_Test::TestBody()
> @ 0x1077026ba testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0x1076b3ad7 testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0x1076b3985 testing::Test::Run()
> @ 0x1076b54f8 testing::TestInfo::Run()
> @ 0x1076b6867 testing::TestCase::Run()
> @ 0x1076c65dc testing::internal::UnitTestImpl::RunAllTests()
> @ 0x1077033da testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0x1076c6007 testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0x1076c5ed8 testing::UnitTest::Run()
> @ 0x1074d55c1 RUN_ALL_TESTS()
> @ 0x1074d5580 main
> @ 0x7fffb4eba255 start
> make[6]: *** [check-local] Segmentation fault: 11
> make[5]: *** [check-am] Error 2
> make[4]: *** [check-recursive] Error 1
> make[3]: *** [check] Error 2
> make[2]: *** [check-recursive] Error 1
> make[1]: *** [check] Error 2
> make: *** [check-recursive] Error 1
> make: *** [check-recursive] Error 1





[jira] [Commented] (MESOS-1606) Slave failed to checkpoint on Mac OS X

2017-06-02 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035631#comment-16035631
 ] 

Neil Conway commented on MESOS-1606:


Perhaps a disk I/O error, e.g., due to a flaky disk?

> Slave failed to checkpoint on Mac OS X
> --
>
> Key: MESOS-1606
> URL: https://issues.apache.org/jira/browse/MESOS-1606
> Project: Mesos
>  Issue Type: Bug
>  Components: agent
> Environment: Mac OS X, Darwin Kernel Version 13.3.0
>Reporter: Zuyu Zhang
>
> {noformat}
> This bug happens to test_framework and LowLevelSchedulerLibprocess as well.
> [ RUN  ] ExamplesTest.LowLevelSchedulerPthread
> Using temporary directory '/tmp/ExamplesTest_LowLevelSchedulerPthread_SCL6Al'
> Enabling authentication for the scheduler
> I0715 19:03:59.296200 2019271440 scheduler.cpp:132] Version: 0.20.0
> I0715 19:03:59.300429 2019271440 leveldb.cpp:176] Opened db in 1982us
> I0715 19:03:59.300900 2019271440 leveldb.cpp:183] Compacted db in 447us
> I0715 19:03:59.300946 2019271440 leveldb.cpp:198] Created db iterator in 27us
> I0715 19:03:59.300978 2019271440 leveldb.cpp:204] Seeked to beginning of db 
> in 16us
> I0715 19:03:59.301007 2019271440 leveldb.cpp:273] Iterated through 0 keys in 
> the db in 20us
> I0715 19:03:59.301053 2019271440 replica.cpp:741] Replica recovered with log 
> positions 0 -> 0 with 1 holes and 0 unlearned
> I0715 19:03:59.301713 222965760 recover.cpp:425] Starting replica recovery
> I0715 19:03:59.301914 222965760 recover.cpp:451] Replica is in EMPTY status
> I0715 19:03:59.302671 221892608 replica.cpp:638] Replica in EMPTY status 
> received a broadcasted recover request
> I0715 19:03:59.302781 224575488 recover.cpp:188] Received a recover response 
> from a replica in EMPTY status
> I0715 19:03:59.303050 225112064 recover.cpp:542] Updating replica status to 
> STARTING
> I0715 19:03:59.303432 222965760 leveldb.cpp:306] Persisting metadata (8 
> bytes) to leveldb took 298us
> I0715 19:03:59.303475 222965760 replica.cpp:320] Persisted replica status to 
> STARTING
> I0715 19:03:59.303540 221356032 recover.cpp:451] Replica is in STARTING status
> I0715 19:03:59.303797 224575488 master.cpp:288] Master 
> 20140715-190359-16777343-64313-60122 (localhost) started on 127.0.0.1:64313
> I0715 19:03:59.303848 224575488 master.cpp:325] Master only allowing 
> authenticated frameworks to register
> I0715 19:03:59.303865 224575488 master.cpp:332] Master allowing 
> unauthenticated slaves to register
> I0715 19:03:59.303884 224575488 credentials.hpp:36] Loading credentials for 
> authentication from 
> '/tmp/ExamplesTest_LowLevelSchedulerPthread_SCL6Al/credentials'
> W0715 19:03:59.303961 224575488 credentials.hpp:51] Permissions on 
> credentials file 
> '/tmp/ExamplesTest_LowLevelSchedulerPthread_SCL6Al/credentials' are too open. 
> It is recommended that your credentials file is NOT accessible by others.
> I0715 19:03:59.304028 224575488 master.cpp:359] Authorization enabled
> I0715 19:03:59.304379 223502336 replica.cpp:638] Replica in STARTING status 
> received a broadcasted recover request
> I0715 19:03:59.304505 2019271440 containerizer.cpp:124] Using isolation: 
> posix/cpu,posix/mem
> I0715 19:03:59.304666 223502336 recover.cpp:188] Received a recover response 
> from a replica in STARTING status
> I0715 19:03:59.304805 223502336 recover.cpp:542] Updating replica status to 
> VOTING
> I0715 19:03:59.305186 223502336 leveldb.cpp:306] Persisting metadata (8 
> bytes) to leveldb took 214us
> I0715 19:03:59.305219 223502336 replica.cpp:320] Persisted replica status to 
> VOTING
> I0715 19:03:59.305250 223502336 recover.cpp:556] Successfully joined the 
> Paxos group
> I0715 19:03:59.305361 223502336 recover.cpp:440] Recover process terminated
> I0715 19:03:59.305927 224038912 slave.cpp:168] Slave started on 
> 1)@127.0.0.1:64313
> I0715 19:03:59.306221 224038912 slave.cpp:279] Slave resources: cpus(*):4; 
> mem(*):7168; disk(*):470714; ports(*):[31000-32000]
> I0715 19:03:59.306234 2019271440 containerizer.cpp:124] Using isolation: 
> posix/cpu,posix/mem
> I0715 19:03:59.306248 223502336 master.cpp:1128] The newly elected leader is 
> master@127.0.0.1:64313 with id 20140715-190359-16777343-64313-60122
> I0715 19:03:59.306269 223502336 master.cpp:1141] Elected as the leading 
> master!
> I0715 19:03:59.306293 223502336 master.cpp:959] Recovering from registrar
> I0715 19:03:59.306395 225112064 registrar.cpp:313] Recovering registrar
> I0715 19:03:59.306617 221892608 log.cpp:656] Attempting to start the writer
> I0715 19:03:59.306952 224575488 slave.cpp:168] Slave started on 
> 2)@127.0.0.1:64313
> I0715 19:03:59.307158 224575488 slave.cpp:279] Slave resources: cpus(*):4; 
> mem(*):7168; disk(*):470714; ports(*):[31000-32000]
> I0715 19:03:59.307207 222965760 replica.cpp:474] Replica received

[jira] [Created] (MESOS-7617) UCR cannot read docker images containing long file paths

2017-06-02 Thread Chun-Hung Hsiao (JIRA)
Chun-Hung Hsiao created MESOS-7617:
--

 Summary: UCR cannot read docker images containing long file paths
 Key: MESOS-7617
 URL: https://issues.apache.org/jira/browse/MESOS-7617
 Project: Mesos
  Issue Type: Bug
  Components: containerization
Reporter: Chun-Hung Hsiao


The latest Docker uses Go 1.7.5 
(https://github.com/moby/moby/blob/master/CHANGELOG.md#contrib-1), in which the 
{{archive/tar}} package has a bug: it mishandles file paths longer than 100 
characters (https://github.com/golang/go/issues/17630). As a result, Docker 
generates images containing ill-formed tar files (details below) when there 
are long paths. Docker itself understands the ill-formed image fine, but a 
standard tar program will interpret the image as if all files with long paths 
are placed under the root directory (https://github.com/moby/moby/issues/29360).

This bug has been fixed in Go 1.8, but since Docker still uses the buggy 
version, we might need to handle these ill-formed images created by Docker 
utilities.

NOTE: It is confirmed that the {{archive/tar}} package in Go 1.8 cannot 
correctly extract the ill-formed tar files, but the one in Go 1.7.5 can.

Details: the {{archive/tar}} package uses the {{USTAR}} layout for files with 
paths longer than 100 characters (putting only the file name in the {{name}} 
field and the directory path in the {{prefix}} field of the tar header), but 
writes {{OLDGNU}}'s magic string. Readers that see the {{OLDGNU}} magic do not 
interpret the {{prefix}} field, so a standard tar program extracts such files 
under the current directory.
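
The header mismatch described above can be illustrated with a minimal Python sketch. The field offsets follow the standard 512-byte ustar header layout; {{make_header}} and {{resolve_path}} are hypothetical helpers written for this illustration, not code from Docker, Go, or Mesos, and the reader logic is a simplified assumption about how a strict reader keys on the magic string.

```python
# Offsets within a 512-byte tar header block (ustar layout).
NAME = slice(0, 100)
MAGIC = slice(257, 265)    # 6-byte magic followed by 2-byte version
PREFIX = slice(345, 500)

POSIX_MAGIC = b"ustar\x0000"   # "ustar", NUL, then version "00"
OLDGNU_MAGIC = b"ustar  \x00"  # "ustar", two spaces, NUL


def make_header(name: bytes, prefix: bytes, magic: bytes) -> bytes:
    """Build a bare header with only name, magic, and prefix filled in."""
    assert len(name) <= 100 and len(prefix) <= 155 and len(magic) == 8
    block = bytearray(512)
    block[0:len(name)] = name
    block[257:265] = magic
    block[345:345 + len(prefix)] = prefix
    return bytes(block)


def _field(header: bytes, where: slice) -> str:
    """Decode a NUL-terminated header field."""
    return header[where].split(b"\x00", 1)[0].decode("utf-8")


def resolve_path(header: bytes) -> str:
    """Simplified reader: honor the prefix field only for the POSIX magic."""
    name = _field(header, NAME)
    if header[MAGIC] == POSIX_MAGIC:
        prefix = _field(header, PREFIX)
        if prefix:
            return prefix + "/" + name
    # Any other magic (including OLDGNU): the prefix bytes are not read.
    return name


long_dir = b"d" * 120  # hypothetical directory path longer than 100 bytes
print(resolve_path(make_header(b"file.txt", long_dir, OLDGNU_MAGIC)))
print(resolve_path(make_header(b"file.txt", long_dir, POSIX_MAGIC)))
```

With the OLDGNU magic the sketch resolves only the bare file name, which mirrors the "extracted under the current directory" symptom; with the POSIX magic the same bytes resolve to the full path.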





[jira] [Commented] (MESOS-7458) webui display of framework resources is confusing

2017-06-02 Thread Vinod Kone (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035363#comment-16035363
 ] 

Vinod Kone commented on MESOS-7458:
---

[~haosd...@gmail.com] Are you actively working on this? We need a fix for this 
ASAP, so if you do not have cycles, I would like to take over. Thanks.

> webui display of framework resources is confusing
> -
>
> Key: MESOS-7458
> URL: https://issues.apache.org/jira/browse/MESOS-7458
> Project: Mesos
>  Issue Type: Bug
>  Components: webui
>Reporter: Neil Conway
>Assignee: haosdent
>  Labels: mesosphere
> Attachments: Screen Shot 2017-05-04 at 11.15.12 AM.png, Screen Shot 
> 2017-05-04 at 11.15.25 AM.png
>
>
> In the webui, the list of frameworks displays the {{used_resources}} for each 
> framework. When you click on the framework to access the per-framework page, 
> the resources displayed are the *total* resources (the {{resources}} key in 
> state.json, which is {{used_resources}} + {{offered_resources}}). This is 
> confusing in situations when the offered resources are very different from 
> the used resources.
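
To make the discrepancy concrete, here is a small Python sketch of the relationship described above. The key names {{used_resources}}, {{offered_resources}}, and {{resources}} come from the issue; the numbers and the {{total_resources}} helper are hypothetical, and only scalar resources are modelled.

```python
def total_resources(used: dict, offered: dict) -> dict:
    """Sum two scalar resource maps key by key."""
    total = dict(used)
    for name, amount in offered.items():
        total[name] = total.get(name, 0) + amount
    return total


# Hypothetical framework entry: small usage, large outstanding offers.
framework = {
    "used_resources": {"cpus": 1.0, "mem": 512.0},
    "offered_resources": {"cpus": 7.0, "mem": 15872.0},
}
framework["resources"] = total_resources(
    framework["used_resources"], framework["offered_resources"]
)

print("list view shows:     ", framework["used_resources"])
print("framework page shows:", framework["resources"])
```

When the offered amounts dominate, the two views differ by a large factor, which is the confusing case the issue describes.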





[jira] [Comment Edited] (MESOS-7610) Support domains in master and agent

2017-06-02 Thread Neil Conway (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-7610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035339#comment-16035339
 ] 

Neil Conway edited comment on MESOS-7610 at 6/2/17 8:10 PM:


https://reviews.apache.org/r/59761/
https://reviews.apache.org/r/59762/


was (Author: neilc):
https://reviews.apache.org/r/59761/

> Support domains in master and agent
> ---
>
> Key: MESOS-7610
> URL: https://issues.apache.org/jira/browse/MESOS-7610
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>






[jira] [Updated] (MESOS-7608) Protobuf definitions for domains

2017-06-02 Thread Neil Conway (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Neil Conway updated MESOS-7608:
---
Description: (was: https://reviews.apache.org/r/59759/)

https://reviews.apache.org/r/59759/

> Protobuf definitions for domains
> 
>
> Key: MESOS-7608
> URL: https://issues.apache.org/jira/browse/MESOS-7608
> Project: Mesos
>  Issue Type: Improvement
>Reporter: Neil Conway
>Assignee: Neil Conway
>  Labels: mesosphere
>






[jira] [Created] (MESOS-7616) Consider supporting changes to agent's domain without full drain.

2017-06-02 Thread Neil Conway (JIRA)
Neil Conway created MESOS-7616:
--

 Summary: Consider supporting changes to agent's domain without 
full drain.
 Key: MESOS-7616
 URL: https://issues.apache.org/jira/browse/MESOS-7616
 Project: Mesos
  Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway


In the initial review chain, any change to an agent's domain requires a full 
drain. This is simple and straightforward, but it makes it more difficult for 
operators to opt in to using fault domains.

We should consider allowing agents to transition from "no configured domain" to 
"configured domain" without requiring an agent drain. This has some 
complications, however: e.g., without an API for communicating changes in an 
agent's configuration to frameworks, they might not realize that an agent's 
domain has changed.





[jira] [Created] (MESOS-7615) Report registration errors to agents.

2017-06-02 Thread Neil Conway (JIRA)
Neil Conway created MESOS-7615:
--

 Summary: Report registration errors to agents.
 Key: MESOS-7615
 URL: https://issues.apache.org/jira/browse/MESOS-7615
 Project: Mesos
  Issue Type: Improvement
Reporter: Neil Conway
Priority: Minor


Agent registration attempts might be ignored by the master for various reasons, 
such as:

* the agent's version number is malformed
* the agent has a configured domain but the master does not
* agent registration message fails validation

When this occurs, the master writes a warning message to its log, but it would 
also be nice for it to send the agent a warning message; this would let the 
agent understand/log why it hasn't successfully registered.





[jira] [Created] (MESOS-7614) Only offer resources on remote agents to region-aware frameworks

2017-06-02 Thread Neil Conway (JIRA)
Neil Conway created MESOS-7614:
--

 Summary: Only offer resources on remote agents to region-aware 
frameworks
 Key: MESOS-7614
 URL: https://issues.apache.org/jira/browse/MESOS-7614
 Project: Mesos
  Issue Type: Improvement
  Components: allocation
Reporter: Neil Conway
Assignee: Neil Conway


If both master and agent are configured with domains, frameworks that are not 
region-aware should not receive offers for resources on agents in remote 
regions.
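
The rule above could be sketched as a small predicate. This is a minimal Python illustration, not the allocator's code: {{Domain}} and {{should_offer}} are hypothetical names, and it assumes that "remote" means the agent's region differs from the master's region.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Domain:
    region: str
    zone: str


def should_offer(master: Optional[Domain],
                 agent: Optional[Domain],
                 framework_region_aware: bool) -> bool:
    """Decide whether an agent's resources may be offered to a framework."""
    # The restriction only applies when both sides have a configured domain.
    if master is None or agent is None:
        return True
    # Assumption: an agent is "remote" if its region differs from the master's.
    is_remote = agent.region != master.region
    return framework_region_aware or not is_remote


master = Domain("us-west", "a")
print(should_offer(master, Domain("us-east", "b"), framework_region_aware=False))
print(should_offer(master, Domain("us-east", "b"), framework_region_aware=True))
```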





[jira] [Created] (MESOS-7613) Unit test for master behavior with mixed regions

2017-06-02 Thread Neil Conway (JIRA)
Neil Conway created MESOS-7613:
--

 Summary: Unit test for master behavior with mixed regions
 Key: MESOS-7613
 URL: https://issues.apache.org/jira/browse/MESOS-7613
 Project: Mesos
  Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway


It would be nice to write unit tests to check that:

* A standby master joins the Zk group if it has the same region and zone as the 
leading master
* A standby master joins the Zk group if it has the same region as the leading 
master but a different zone
* A standby master joins the Zk group if it has no configured domain but the 
leading master has a configured domain.
* A standby master joins the Zk group if it has a configured domain but the 
leading master does not have a configured domain.
* A standby master aborts with an error message if it is configured to use a 
different region than the leading master.

Unfortunately, we cannot easily test this scenario due to MESOS-2976.
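
The five expectations listed above can be captured in a tiny decision function. The following Python sketch is only an illustration of the intended behavior; {{standby_may_join}} and the {{(region, zone)}} tuple representation are hypothetical and do not correspond to Mesos code.

```python
from typing import Optional, Tuple

Domain = Tuple[str, str]  # (region, zone); None means "no configured domain"


def standby_may_join(standby: Optional[Domain],
                     leader: Optional[Domain]) -> bool:
    """Return True if the standby joins the Zk group, False if it aborts."""
    # If either side has no configured domain, the standby joins.
    if standby is None or leader is None:
        return True
    # Zones may differ; only a region mismatch causes an abort.
    return standby[0] == leader[0]


cases = [
    (("r1", "z1"), ("r1", "z1")),  # same region and zone
    (("r1", "z2"), ("r1", "z1")),  # same region, different zone
    (None, ("r1", "z1")),          # standby has no domain
    (("r1", "z1"), None),          # leader has no domain
    (("r2", "z1"), ("r1", "z1")),  # different region
]
for standby, leader in cases:
    print(standby, leader, standby_may_join(standby, leader))
```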






[jira] [Created] (MESOS-7612) Prevent agent with misconfigured domain from registering

2017-06-02 Thread Neil Conway (JIRA)
Neil Conway created MESOS-7612:
--

 Summary: Prevent agent with misconfigured domain from registering
 Key: MESOS-7612
 URL: https://issues.apache.org/jira/browse/MESOS-7612
 Project: Mesos
  Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway


We expect that the master's domain will be configured before the agent's 
domain. Hence, if an agent with a configured domain attempts to register with a 
master that has no configured domain, its registration attempt should be 
ignored.





[jira] [Created] (MESOS-7611) Prevent master from joining mixed-region cluster

2017-06-02 Thread Neil Conway (JIRA)
Neil Conway created MESOS-7611:
--

 Summary: Prevent master from joining mixed-region cluster
 Key: MESOS-7611
 URL: https://issues.apache.org/jira/browse/MESOS-7611
 Project: Mesos
  Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway


If a master with configured region X joins a cluster where the leading master 
has configured region Y, it should abort with an error message. This enforces 
the invariant that all the masters in the same Mesos cluster are configured to 
use the same region.





[jira] [Created] (MESOS-7610) Support domains in master and agent

2017-06-02 Thread Neil Conway (JIRA)
Neil Conway created MESOS-7610:
--

 Summary: Support domains in master and agent
 Key: MESOS-7610
 URL: https://issues.apache.org/jira/browse/MESOS-7610
 Project: Mesos
  Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway








[jira] [Created] (MESOS-7609) Protobuf definitions for region-aware framework capability

2017-06-02 Thread Neil Conway (JIRA)
Neil Conway created MESOS-7609:
--

 Summary: Protobuf definitions for region-aware framework capability
 Key: MESOS-7609
 URL: https://issues.apache.org/jira/browse/MESOS-7609
 Project: Mesos
  Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway








[jira] [Created] (MESOS-7608) Protobuf definitions for domains

2017-06-02 Thread Neil Conway (JIRA)
Neil Conway created MESOS-7608:
--

 Summary: Protobuf definitions for domains
 Key: MESOS-7608
 URL: https://issues.apache.org/jira/browse/MESOS-7608
 Project: Mesos
  Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway








[jira] [Created] (MESOS-7607) Support for first-class fault domains.

2017-06-02 Thread Neil Conway (JIRA)
Neil Conway created MESOS-7607:
--

 Summary: Support for first-class fault domains.
 Key: MESOS-7607
 URL: https://issues.apache.org/jira/browse/MESOS-7607
 Project: Mesos
  Issue Type: Epic
Reporter: Neil Conway
Assignee: Neil Conway


Mesos should support a first-class notion of "fault domains", which effectively 
provide a common vocabulary for describing the region and zone where a node 
(either master or agent) is located.

Design doc: 
https://drive.google.com/open?id=1gEugdkLRbBsqsiFv3urRPRNrHwUC-i1HwfFfHR_MvC8





[jira] [Commented] (MESOS-6556) Hostname support for the network/cni isolator.

2017-06-02 Thread James DeFelice (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034991#comment-16034991
 ] 

James DeFelice commented on MESOS-6556:
---

{{hostname}} is only applied when there are container networks present. When 
using host-mode networking, the UTS namespace is not isolated and {{hostname}} 
is not applied to the container. Tracking via 
https://issues.apache.org/jira/browse/MESOS-7605

> Hostname support for the network/cni isolator.
> --
>
> Key: MESOS-6556
> URL: https://issues.apache.org/jira/browse/MESOS-6556
> Project: Mesos
>  Issue Type: Improvement
>  Components: containerization
>Reporter: James Peach
>Assignee: James Peach
>Priority: Minor
> Fix For: 1.2.0
>
>
> -Add a {{namespace/uts}} isolator for doing UTS namespace isolation without 
> using the CNI isolator.-
> Update the {{network/cni}} isolator to set the hostname specified by the task 
> info.





[jira] [Created] (MESOS-7606) Hierarchical allocator seems to perform redundant activation / deactivation of newly added frameworks.

2017-06-02 Thread Alexander Rukletsov (JIRA)
Alexander Rukletsov created MESOS-7606:
--

 Summary: Hierarchical allocator seems to perform redundant 
activation / deactivation of newly added frameworks.
 Key: MESOS-7606
 URL: https://issues.apache.org/jira/browse/MESOS-7606
 Project: Mesos
  Issue Type: Bug
  Components: allocation
Affects Versions: 1.3.0
Reporter: Alexander Rukletsov


According to the logs,
{noformat}
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:32:58.226356 29728 hierarchical.cpp:274] Added framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:32:58.226405 29728 hierarchical.cpp:379] Deactivated framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: 
I0601 11:32:58.228570 29728 hierarchical.cpp:343] Activated framework 
6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
{noformat}
the built-in allocator first deactivates a newly added framework and then 
activates it again. This seems redundant: if a sorter client should always 
start deactivated, we should not call deactivate on it but rather add it in a 
deactivated state. This will naturally eliminate the logging issue.
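
A minimal Python sketch of the proposed shape follows. The {{Sorter}} class here is a hypothetical stand-in written for illustration (the real sorter is C++ and its interface may differ); the point is only that adding a client in a deactivated state removes the extra deactivate call and its log line.

```python
class Sorter:
    """Hypothetical stand-in for an allocator sorter; not the Mesos class."""

    def __init__(self):
        self.active = {}   # client name -> whether the client is active
        self.events = []   # calls made so far, mimicking the log lines

    def add(self, client, active=False):
        # New clients start deactivated unless the caller says otherwise,
        # so no separate deactivate() call is needed after add().
        self.active[client] = active
        self.events.append("added")

    def activate(self, client):
        self.active[client] = True
        self.events.append("activated")

    def deactivate(self, client):
        self.active[client] = False
        self.events.append("deactivated")


sorter = Sorter()
sorter.add("framework-1")       # starts deactivated
sorter.activate("framework-1")  # activated once the framework is ready
print(sorter.events)
```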





[jira] [Assigned] (MESOS-7601) Some container launch failures are mistakenly treated as errors.

2017-06-02 Thread Alexander Rukletsov (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Rukletsov reassigned MESOS-7601:
--

Assignee: Alexander Rukletsov

https://reviews.apache.org/r/59746/

> Some container launch failures are mistakenly treated as errors.
> 
>
> Key: MESOS-7601
> URL: https://issues.apache.org/jira/browse/MESOS-7601
> Project: Mesos
>  Issue Type: Bug
>  Components: containerization
>Affects Versions: 1.3.0
>Reporter: Alexander Rukletsov
>Assignee: Alexander Rukletsov
>  Labels: containerizer, mesosphere
>
> I've observed a case where a scheduler stops (i.e. calls TEARDOWN) while some 
> of its tasks are being launched. While this is valid behaviour, the agent 
> prints an error and increments the container launch errors metric.
> Below are log excerpts for such a framework, 
> {{6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092}}.
> *Master log*
> {noformat}
> [centos@ip-172-31-6-200 ~]$ journalctl _PID=29716 --since "2 hours ago" 
> --no-pager | grep 
> "6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092"
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.226218 29724 master.cpp:6072] Updating 
> info for framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.226356 29728 hierarchical.cpp:274] Added 
> framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.226405 29728 hierarchical.cpp:379] 
> Deactivated framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.228570 29728 hierarchical.cpp:343] 
> Activated framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.246068 29721 master.cpp:7105] Sending 1 
> offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.247851 29721 master.cpp:7194] Sending 1 
> inverse offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:58.912937 29728 master.cpp:4806] Processing 
> DECLINE call for offers: [ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509464 ] for 
> framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:59 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:59.804184 29727 master.cpp:7105] Sending 2 
> offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:59 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:32:59.804411 29727 master.cpp:7194] Sending 2 
> inverse offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:33:01.248924 29721 master.cpp:7105] Sending 2 
> offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:33:01.249289 29721 master.cpp:7194] Sending 2 
> inverse offers to framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 
> (TeraValidate) at 
> scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal 
> mesos-master[29716]: I0601 11:33:01.249724 29721 master.cpp:3851] Processing 
> ACCEPT call for offers: [ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509469 ] on 
> agent 36a25adb-4ea2-49d3-a195-448cff1dc146-S35 at slave(1)@172.31.13.122:5051 
> (172.31.13.122) for framework 
> 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driv

[jira] [Created] (MESOS-7605) UCR doesn't isolate uts namespace w/ host networking

2017-06-02 Thread James DeFelice (JIRA)
James DeFelice created MESOS-7605:
-

 Summary: UCR doesn't isolate uts namespace w/ host networking
 Key: MESOS-7605
 URL: https://issues.apache.org/jira/browse/MESOS-7605
 Project: Mesos
  Issue Type: Improvement
  Components: containerization
Reporter: James DeFelice


Docker's {{run}} command supports a {{--hostname}} parameter which impacts 
container isolation, even in {{host}} network mode: (via 
https://docs.docker.com/engine/reference/run/)
{quote}
Even in host network mode a container has its own UTS namespace by default. As 
such --hostname is allowed in host network mode and will only change the 
hostname inside the container. Similar to --hostname, the --add-host, --dns, 
--dns-search, and --dns-option options can be used in host network mode.
{quote}
I see no evidence that UCR offers a similar isolation capability.

Related: the {{ContainerInfo}} protobuf has a {{hostname}} field which was 
initially added to support the Docker containerizer's use of the {{--hostname}} 
Docker {{run}} flag.





[jira] [Assigned] (MESOS-5995) Protobuf JSON deserialisation does not accept numbers formated as strings

2017-06-02 Thread Till Toenshoff (JIRA)

 [ 
https://issues.apache.org/jira/browse/MESOS-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Till Toenshoff reassigned MESOS-5995:
-

Assignee: (was: Tomasz Janiszewski)

Unassigned this issue due to discarded review and lack of new activity.

> Protobuf JSON deserialisation does not accept numbers formated as strings
> -
>
> Key: MESOS-5995
> URL: https://issues.apache.org/jira/browse/MESOS-5995
> Project: Mesos
>  Issue Type: Bug
>  Components: HTTP API
>Affects Versions: 1.0.0
>Reporter: Tomasz Janiszewski
>Priority: Critical
>
> Proto2 does not specify JSON mappings, but 
> [Proto3|https://developers.google.com/protocol-buffers/docs/proto3#json] does, 
> and it recommends mapping 64-bit numbers to strings. Unfortunately, Mesos does 
> not accept strings in place of uint64 and returns a 400 Bad Request error:
> {quote}
> Failed to convert JSON into Call protobuf: Not expecting a JSON 
> string for field 'value'.
> {quote}
> Is this on purpose or is this a bug?
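
For illustration, a lenient parser of the kind the reporter is asking about might look like the Python sketch below. This is not Mesos code (the Mesos JSON-to-protobuf conversion is C++); {{parse_uint64}} is a hypothetical helper that accepts either a JSON number or a decimal string for a uint64 field.

```python
import json
import re

UINT64_MAX = 2**64 - 1


def parse_uint64(value):
    """Accept a JSON integer or a decimal string and return it as an int."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it explicitly.
        raise ValueError("expected a number, got a boolean")
    if isinstance(value, str):
        if not re.fullmatch(r"[0-9]+", value):
            raise ValueError("not a decimal string: %r" % value)
        value = int(value)
    if not isinstance(value, int) or not 0 <= value <= UINT64_MAX:
        raise ValueError("not a valid uint64: %r" % value)
    return value


# Both encodings of the same field yield the same integer.
as_number = json.loads('{"value": 42}')
as_string = json.loads('{"value": "42"}')
print(parse_uint64(as_number["value"]) == parse_uint64(as_string["value"]))
```

Accepting strings matters mostly for values above 2^53, which many JSON implementations cannot represent exactly as numbers.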


