[jira] [Updated] (MESOS-7617) UCR cannot read docker images containing long file paths
[ https://issues.apache.org/jira/browse/MESOS-7617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Yu updated MESOS-7617:
--------------------------
    Affects Version/s: 1.3.0
                       1.1.2
                       1.2.0

> UCR cannot read docker images containing long file paths
> ---------------------------------------------------------
>
> Key: MESOS-7617
> URL: https://issues.apache.org/jira/browse/MESOS-7617
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Affects Versions: 1.1.2, 1.2.0, 1.3.0, 1.3.1
> Reporter: Chun-Hung Hsiao
> Labels: containerizer, triaged
>
> The latest Docker uses Go 1.7.5
> (https://github.com/moby/moby/blob/master/CHANGELOG.md#contrib-1), in which
> the {{archive/tar}} package has a bug: it cannot handle file paths longer
> than 100 characters (https://github.com/golang/go/issues/17630). As a result,
> Docker will generate images containing ill-formed tar files (details below)
> when there are long paths. Docker itself understands the ill-formed image
> fine, but a standard tar program will interpret the image as if all files
> with long paths are placed under the root directory
> (https://github.com/moby/moby/issues/29360).
>
> This bug has been fixed in Go 1.8, but since Docker is still using the buggy
> version, we might need to handle these ill-formed images created by Docker
> utilities.
>
> NOTE: It is confirmed that the {{archive/tar}} package in Go 1.8 cannot
> correctly extract the ill-formed tar files, but the one in Go 1.7.5 could.
>
> Details: the {{archive/tar}} package uses the {{USTAR}} format to handle files
> with 100+-character-long paths (by putting only the file name in the {{name}}
> field and the path in the {{prefix}} field of the tar header), but writes the
> {{OLDGNU}} magic string, and that format does not understand the {{prefix}}
> field, so a standard tar program will extract such files under the current
> directory.

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
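The header mismatch described in the issue above can be sketched in a few lines of Python. This is only an illustration, assuming the standard 512-byte tar header layout (name at offset 0, magic plus version at offset 257, prefix at offset 345). The helpers `make_header`, `strict_path` and `lenient_path` are hypothetical names invented for this sketch; they are not Mesos, Docker or Go APIs, and the lenient reader is just one conceivable workaround, not the fix adopted for this ticket.

```python
BLOCK_SIZE = 512
NAME = slice(0, 100)       # file name field
MAGIC = slice(257, 265)    # magic (6 bytes) followed by version (2 bytes)
PREFIX = slice(345, 500)   # path prefix field, defined by USTAR only

USTAR_MAGIC = b"ustar\x0000"    # "ustar", NUL, then version "00"
OLDGNU_MAGIC = b"ustar  \x00"   # "ustar", two spaces, NUL


def _field(block, where):
    """Decode a NUL-terminated header field."""
    return bytes(block[where]).split(b"\x00", 1)[0].decode("utf-8")


def make_header(name, prefix, magic):
    """Build a bare header holding only the fields this sketch inspects."""
    block = bytearray(BLOCK_SIZE)
    for start, value in ((0, name.encode()), (257, magic), (345, prefix.encode())):
        block[start:start + len(value)] = value
    return bytes(block)


def strict_path(block):
    """Read the path the way a standard tar program would: prefix only for USTAR."""
    name = _field(block, NAME)
    prefix = _field(block, PREFIX) if block[MAGIC] == USTAR_MAGIC else ""
    return f"{prefix}/{name}" if prefix else name


def lenient_path(block):
    """Hypothetical workaround: honor a non-empty prefix regardless of the magic."""
    name = _field(block, NAME)
    prefix = _field(block, PREFIX)
    return f"{prefix}/{name}" if prefix else name


# A header shaped like the ill-formed ones: USTAR-style prefix, OLDGNU magic.
header = make_header("layer.tar", "some/long/directory", OLDGNU_MAGIC)
print(strict_path(header))   # layer.tar
print(lenient_path(header))  # some/long/directory/layer.tar
```

Note that genuine GNU-format headers reuse the bytes at offset 345 for other metadata, so a real reader could not blindly apply `lenient_path`; it would need some additional check before trusting those bytes as a path prefix.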
[jira] [Updated] (MESOS-7617) UCR cannot read docker images containing long file paths
[ https://issues.apache.org/jira/browse/MESOS-7617?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jie Yu updated MESOS-7617:
--------------------------
    Affects Version/s: 1.3.1

> UCR cannot read docker images containing long file paths
> ---------------------------------------------------------
>
> Key: MESOS-7617
> URL: https://issues.apache.org/jira/browse/MESOS-7617
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Affects Versions: 1.1.2, 1.2.0, 1.3.0, 1.3.1
> Reporter: Chun-Hung Hsiao
> Labels: containerizer, triaged
>
> The latest Docker uses Go 1.7.5
> (https://github.com/moby/moby/blob/master/CHANGELOG.md#contrib-1), in which
> the {{archive/tar}} package has a bug: it cannot handle file paths longer
> than 100 characters (https://github.com/golang/go/issues/17630). As a result,
> Docker will generate images containing ill-formed tar files (details below)
> when there are long paths. Docker itself understands the ill-formed image
> fine, but a standard tar program will interpret the image as if all files
> with long paths are placed under the root directory
> (https://github.com/moby/moby/issues/29360).
>
> This bug has been fixed in Go 1.8, but since Docker is still using the buggy
> version, we might need to handle these ill-formed images created by Docker
> utilities.
>
> NOTE: It is confirmed that the {{archive/tar}} package in Go 1.8 cannot
> correctly extract the ill-formed tar files, but the one in Go 1.7.5 could.
>
> Details: the {{archive/tar}} package uses the {{USTAR}} format to handle files
> with 100+-character-long paths (by putting only the file name in the {{name}}
> field and the path in the {{prefix}} field of the tar header), but writes the
> {{OLDGNU}} magic string, and that format does not understand the {{prefix}}
> field, so a standard tar program will extract such files under the current
> directory.
[jira] [Commented] (MESOS-7095) Basic make check from getting started link fails
[ https://issues.apache.org/jira/browse/MESOS-7095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035652#comment-16035652 ]

Benjamin Mahler commented on MESOS-7095:
----------------------------------------

[~tillt] does the getting started guide need any updates related to this so that users don't hit it?

http://mesos.apache.org/gettingstarted/

> Basic make check from getting started link fails
> -------------------------------------------------
>
> Key: MESOS-7095
> URL: https://issues.apache.org/jira/browse/MESOS-7095
> Project: Mesos
> Issue Type: Bug
> Components: build
> Reporter: Alec Bruns
>
> *** Aborted at 1486657215 (unix time) try "date -d @1486657215" if you are using GNU date ***
> PC: @ 0x1080b7367 apr_pool_create_ex
> *** SIGSEGV (@0x30) received by PID 25167 (TID 0x7fffbdd073c0) stack trace: ***
> @ 0x7fffb50c7bba _sigtramp
> @ 0x72c0517 (unknown)
> @ 0x107eaa13a svn_pool_create_ex
> @ 0x107691d6e svn::diff()
> @ 0x107691042 SVNTest_DiffPatch_Test::TestBody()
> @ 0x1077026ba testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0x1076b3ad7 testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0x1076b3985 testing::Test::Run()
> @ 0x1076b54f8 testing::TestInfo::Run()
> @ 0x1076b6867 testing::TestCase::Run()
> @ 0x1076c65dc testing::internal::UnitTestImpl::RunAllTests()
> @ 0x1077033da testing::internal::HandleSehExceptionsInMethodIfSupported<>()
> @ 0x1076c6007 testing::internal::HandleExceptionsInMethodIfSupported<>()
> @ 0x1076c5ed8 testing::UnitTest::Run()
> @ 0x1074d55c1 RUN_ALL_TESTS()
> @ 0x1074d5580 main
> @ 0x7fffb4eba255 start
> make[6]: *** [check-local] Segmentation fault: 11
> make[5]: *** [check-am] Error 2
> make[4]: *** [check-recursive] Error 1
> make[3]: *** [check] Error 2
> make[2]: *** [check-recursive] Error 1
> make[1]: *** [check] Error 2
> make: *** [check-recursive] Error 1
> make: *** [check-recursive] Error 1
[jira] [Commented] (MESOS-1606) Slave failed to checkpoint on Mac OS X
[ https://issues.apache.org/jira/browse/MESOS-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035631#comment-16035631 ]

Neil Conway commented on MESOS-1606:
------------------------------------

Perhaps a disk I/O error, e.g., due to a flaky disk?

> Slave failed to checkpoint on Mac OS X
> ---------------------------------------
>
> Key: MESOS-1606
> URL: https://issues.apache.org/jira/browse/MESOS-1606
> Project: Mesos
> Issue Type: Bug
> Components: agent
> Environment: Mac OS X, Darwin Kernel Version 13.3.0
> Reporter: Zuyu Zhang
>
> {noformat}
> This bug happens to test_framework and LowLevelSchedulerLibprocess as well.
> [ RUN ] ExamplesTest.LowLevelSchedulerPthread
> Using temporary directory '/tmp/ExamplesTest_LowLevelSchedulerPthread_SCL6Al'
> Enabling authentication for the scheduler
> I0715 19:03:59.296200 2019271440 scheduler.cpp:132] Version: 0.20.0
> I0715 19:03:59.300429 2019271440 leveldb.cpp:176] Opened db in 1982us
> I0715 19:03:59.300900 2019271440 leveldb.cpp:183] Compacted db in 447us
> I0715 19:03:59.300946 2019271440 leveldb.cpp:198] Created db iterator in 27us
> I0715 19:03:59.300978 2019271440 leveldb.cpp:204] Seeked to beginning of db in 16us
> I0715 19:03:59.301007 2019271440 leveldb.cpp:273] Iterated through 0 keys in the db in 20us
> I0715 19:03:59.301053 2019271440 replica.cpp:741] Replica recovered with log positions 0 -> 0 with 1 holes and 0 unlearned
> I0715 19:03:59.301713 222965760 recover.cpp:425] Starting replica recovery
> I0715 19:03:59.301914 222965760 recover.cpp:451] Replica is in EMPTY status
> I0715 19:03:59.302671 221892608 replica.cpp:638] Replica in EMPTY status received a broadcasted recover request
> I0715 19:03:59.302781 224575488 recover.cpp:188] Received a recover response from a replica in EMPTY status
> I0715 19:03:59.303050 225112064 recover.cpp:542] Updating replica status to STARTING
> I0715 19:03:59.303432 222965760 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 298us
> I0715 19:03:59.303475 222965760 replica.cpp:320] Persisted replica status to STARTING
> I0715 19:03:59.303540 221356032 recover.cpp:451] Replica is in STARTING status
> I0715 19:03:59.303797 224575488 master.cpp:288] Master 20140715-190359-16777343-64313-60122 (localhost) started on 127.0.0.1:64313
> I0715 19:03:59.303848 224575488 master.cpp:325] Master only allowing authenticated frameworks to register
> I0715 19:03:59.303865 224575488 master.cpp:332] Master allowing unauthenticated slaves to register
> I0715 19:03:59.303884 224575488 credentials.hpp:36] Loading credentials for authentication from '/tmp/ExamplesTest_LowLevelSchedulerPthread_SCL6Al/credentials'
> W0715 19:03:59.303961 224575488 credentials.hpp:51] Permissions on credentials file '/tmp/ExamplesTest_LowLevelSchedulerPthread_SCL6Al/credentials' are too open. It is recommended that your credentials file is NOT accessible by others.
> I0715 19:03:59.304028 224575488 master.cpp:359] Authorization enabled
> I0715 19:03:59.304379 223502336 replica.cpp:638] Replica in STARTING status received a broadcasted recover request
> I0715 19:03:59.304505 2019271440 containerizer.cpp:124] Using isolation: posix/cpu,posix/mem
> I0715 19:03:59.304666 223502336 recover.cpp:188] Received a recover response from a replica in STARTING status
> I0715 19:03:59.304805 223502336 recover.cpp:542] Updating replica status to VOTING
> I0715 19:03:59.305186 223502336 leveldb.cpp:306] Persisting metadata (8 bytes) to leveldb took 214us
> I0715 19:03:59.305219 223502336 replica.cpp:320] Persisted replica status to VOTING
> I0715 19:03:59.305250 223502336 recover.cpp:556] Successfully joined the Paxos group
> I0715 19:03:59.305361 223502336 recover.cpp:440] Recover process terminated
> I0715 19:03:59.305927 224038912 slave.cpp:168] Slave started on 1)@127.0.0.1:64313
> I0715 19:03:59.306221 224038912 slave.cpp:279] Slave resources: cpus(*):4; mem(*):7168; disk(*):470714; ports(*):[31000-32000]
> I0715 19:03:59.306234 2019271440 containerizer.cpp:124] Using isolation: posix/cpu,posix/mem
> I0715 19:03:59.306248 223502336 master.cpp:1128] The newly elected leader is master@127.0.0.1:64313 with id 20140715-190359-16777343-64313-60122
> I0715 19:03:59.306269 223502336 master.cpp:1141] Elected as the leading master!
> I0715 19:03:59.306293 223502336 master.cpp:959] Recovering from registrar
> I0715 19:03:59.306395 225112064 registrar.cpp:313] Recovering registrar
> I0715 19:03:59.306617 221892608 log.cpp:656] Attempting to start the writer
> I0715 19:03:59.306952 224575488 slave.cpp:168] Slave started on 2)@127.0.0.1:64313
> I0715 19:03:59.307158 224575488 slave.cpp:279] Slave resources: cpus(*):4; mem(*):7168; disk(*):470714; ports(*):[31000-32000]
> I0715 19:03:59.307207 222965760 replica.cpp:474] Replica received
[jira] [Created] (MESOS-7617) UCR cannot read docker images containing long file paths
Chun-Hung Hsiao created MESOS-7617:
-----------------------------------

Summary: UCR cannot read docker images containing long file paths
Key: MESOS-7617
URL: https://issues.apache.org/jira/browse/MESOS-7617
Project: Mesos
Issue Type: Bug
Components: containerization
Reporter: Chun-Hung Hsiao

The latest Docker uses Go 1.7.5 (https://github.com/moby/moby/blob/master/CHANGELOG.md#contrib-1), in which the {{archive/tar}} package has a bug: it cannot handle file paths longer than 100 characters (https://github.com/golang/go/issues/17630). As a result, Docker will generate images containing ill-formed tar files (details below) when there are long paths. Docker itself understands the ill-formed image fine, but a standard tar program will interpret the image as if all files with long paths are placed under the root directory (https://github.com/moby/moby/issues/29360).

This bug has been fixed in Go 1.8, but since Docker is still using the buggy version, we might need to handle these ill-formed images created by Docker utilities.

NOTE: It is confirmed that the {{archive/tar}} package in Go 1.8 cannot correctly extract the ill-formed tar files, but the one in Go 1.7.5 could.

Details: the {{archive/tar}} package uses the {{USTAR}} format to handle files with 100+-character-long paths (by putting only the file name in the {{name}} field and the path in the {{prefix}} field of the tar header), but writes the {{OLDGNU}} magic string, and that format does not understand the {{prefix}} field, so a standard tar program will extract such files under the current directory.
[jira] [Commented] (MESOS-7458) webui display of framework resources is confusing
[ https://issues.apache.org/jira/browse/MESOS-7458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035363#comment-16035363 ]

Vinod Kone commented on MESOS-7458:
-----------------------------------

[~haosd...@gmail.com] Are you actively working on this? We need a fix for this ASAP, so if you do not have cycles, I would like to take over. Thanks.

> webui display of framework resources is confusing
> --------------------------------------------------
>
> Key: MESOS-7458
> URL: https://issues.apache.org/jira/browse/MESOS-7458
> Project: Mesos
> Issue Type: Bug
> Components: webui
> Reporter: Neil Conway
> Assignee: haosdent
> Labels: mesosphere
> Attachments: Screen Shot 2017-05-04 at 11.15.12 AM.png, Screen Shot 2017-05-04 at 11.15.25 AM.png
>
> In the webui, the list of frameworks displays the {{used_resources}} for each
> framework. When you click on the framework to access the per-framework page,
> the resources displayed are the *total* resources (the {{resources}} key in
> state.json, which is {{used_resources}} + {{offered_resources}}). This is
> confusing in situations when the offered resources are very different from
> the used resources.
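To make the discrepancy concrete, here is a small sketch of the relationship stated in the issue ({{resources}} = {{used_resources}} + {{offered_resources}}). The function name and the sample numbers are invented for illustration, and only scalar resources are modeled; this is not the webui code.

```python
def total_resources(used, offered):
    """Sum two {resource name: scalar amount} mappings key by key."""
    names = sorted(set(used) | set(offered))
    return {name: used.get(name, 0) + offered.get(name, 0) for name in names}


# Hypothetical framework: little is in use, but a large offer is outstanding.
used = {"cpus": 2.0, "mem": 1024.0}
offered = {"cpus": 6.0, "mem": 7168.0, "disk": 5000.0}

print(used)                            # what the framework list shows
print(total_resources(used, offered))  # what the per-framework page shows
```

With an outstanding offer this large, the two pages report very different numbers for the same framework, which is the confusion the ticket describes.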
[jira] [Comment Edited] (MESOS-7610) Support domains in master and agent
[ https://issues.apache.org/jira/browse/MESOS-7610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16035339#comment-16035339 ]

Neil Conway edited comment on MESOS-7610 at 6/2/17 8:10 PM:
------------------------------------------------------------

https://reviews.apache.org/r/59761/
https://reviews.apache.org/r/59762/

was (Author: neilc):
https://reviews.apache.org/r/59761/

> Support domains in master and agent
> ------------------------------------
>
> Key: MESOS-7610
> URL: https://issues.apache.org/jira/browse/MESOS-7610
> Project: Mesos
> Issue Type: Improvement
> Reporter: Neil Conway
> Assignee: Neil Conway
> Labels: mesosphere
[jira] [Updated] (MESOS-7608) Protobuf definitions for domains
[ https://issues.apache.org/jira/browse/MESOS-7608?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Neil Conway updated MESOS-7608:
-------------------------------
    Description: (was: https://reviews.apache.org/r/59759/)

https://reviews.apache.org/r/59759/

> Protobuf definitions for domains
> ---------------------------------
>
> Key: MESOS-7608
> URL: https://issues.apache.org/jira/browse/MESOS-7608
> Project: Mesos
> Issue Type: Improvement
> Reporter: Neil Conway
> Assignee: Neil Conway
> Labels: mesosphere
[jira] [Created] (MESOS-7616) Consider supporting changes to agent's domain without full drain.
Neil Conway created MESOS-7616:
-------------------------------

Summary: Consider supporting changes to agent's domain without full drain.
Key: MESOS-7616
URL: https://issues.apache.org/jira/browse/MESOS-7616
Project: Mesos
Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway

In the initial review chain, any change to an agent's domain requires a full drain. This is simple and straightforward, but it makes it more difficult for operators to opt in to using fault domains.

We should consider allowing agents to transition from "no configured domain" to "configured domain" without requiring an agent drain. This has some complications, however: e.g., without an API for communicating changes in an agent's configuration to frameworks, they might not realize that an agent's domain has changed.
[jira] [Created] (MESOS-7615) Report registration errors to agents.
Neil Conway created MESOS-7615:
-------------------------------

Summary: Report registration errors to agents.
Key: MESOS-7615
URL: https://issues.apache.org/jira/browse/MESOS-7615
Project: Mesos
Issue Type: Improvement
Reporter: Neil Conway
Priority: Minor

Agent registration attempts might be ignored by the master for various reasons, such as:

* the agent's version number is malformed
* the agent has a configured domain but the master does not
* agent registration message fails validation

When this occurs, the master writes a warning message to its log, but it would also be nice for it to send the agent a warning message; this would let the agent understand/log why it hasn't successfully registered.
[jira] [Created] (MESOS-7614) Only offer resources on remote agents to region-aware frameworks
Neil Conway created MESOS-7614:
-------------------------------

Summary: Only offer resources on remote agents to region-aware frameworks
Key: MESOS-7614
URL: https://issues.apache.org/jira/browse/MESOS-7614
Project: Mesos
Issue Type: Improvement
Components: allocation
Reporter: Neil Conway
Assignee: Neil Conway

If both master and agent are configured with domains, frameworks that are not region-aware should not receive offers for resources on agents in remote regions.
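The rule in this ticket can be sketched as a single predicate. Everything here is hypothetical: the `Domain` type, the `should_offer` function and the `framework_region_aware` flag are illustrative names, not Mesos allocator code, and "remote" is assumed to mean "the agent's region differs from the master's region".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Domain:
    """Hypothetical fault domain: a region plus a zone within it."""
    region: str
    zone: str


def should_offer(master: Optional[Domain],
                 agent: Optional[Domain],
                 framework_region_aware: bool) -> bool:
    """Decide whether an agent's resources may be offered to a framework."""
    # The restriction only applies when both sides have a configured domain.
    if master is None or agent is None:
        return True
    # Agents in the master's own region are offered to every framework.
    if agent.region == master.region:
        return True
    # Remote agents are offered only to region-aware frameworks.
    return framework_region_aware


master = Domain("us-west", "a")
remote_agent = Domain("eu-central", "b")
print(should_offer(master, remote_agent, framework_region_aware=False))  # False
print(should_offer(master, remote_agent, framework_region_aware=True))   # True
```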
[jira] [Created] (MESOS-7613) Unit test for master behavior with mixed regions
Neil Conway created MESOS-7613:
-------------------------------

Summary: Unit test for master behavior with mixed regions
Key: MESOS-7613
URL: https://issues.apache.org/jira/browse/MESOS-7613
Project: Mesos
Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway

It would be nice to write unit tests to check that:

* A standby master joins the Zk group if it has the same region and zone as the leading master
* A standby master joins the Zk group if it has the same region as the leading master but a different zone
* A standby master joins the Zk group if it has no configured domain but the leading master has a configured domain.
* A standby master joins the Zk group if it has a configured domain but the leading master does not have a configured domain.
* A standby master aborts with an error message if it is configured to use a different region than the leading master.

Unfortunately, we cannot easily test this scenario due to MESOS-2976.
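The five cases listed above reduce to one rule, which the following sketch encodes together with a check per case. `Domain` and `standby_may_join` are hypothetical names used only for illustration; the real check would live in the master and involve ZooKeeper, which this sketch does not model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Domain:
    """Hypothetical fault domain: a region plus a zone within it."""
    region: str
    zone: str


def standby_may_join(leader: Optional[Domain], standby: Optional[Domain]) -> bool:
    """Return False only when both domains are configured and the regions differ."""
    if leader is None or standby is None:
        return True
    return leader.region == standby.region


leader = Domain("us-west", "a")
cases = [
    ("same region and zone", leader, Domain("us-west", "a"), True),
    ("same region, different zone", leader, Domain("us-west", "b"), True),
    ("standby has no domain", leader, None, True),
    ("leader has no domain", None, Domain("us-west", "a"), True),
    ("different region", leader, Domain("eu-central", "a"), False),
]
for label, lead, standby, expected in cases:
    assert standby_may_join(lead, standby) is expected, label
print("all five cases behave as listed")
```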
[jira] [Created] (MESOS-7612) Prevent agent with misconfigured domain from registering
Neil Conway created MESOS-7612:
-------------------------------

Summary: Prevent agent with misconfigured domain from registering
Key: MESOS-7612
URL: https://issues.apache.org/jira/browse/MESOS-7612
Project: Mesos
Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway

We expect that the master's domain will be configured before the agent's domain. Hence, if an agent with configured domain attempts to register with a master that has no configured domain, its registration attempt should be ignored.
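As a rough sketch of the check described here: a registration is ignored exactly when the agent has a domain and the master does not. The function name is hypothetical and domains are modeled as plain optional strings for brevity; this is not the master's actual registration path.

```python
from typing import Optional


def master_accepts_registration(master_domain: Optional[str],
                                agent_domain: Optional[str]) -> bool:
    """Ignore agents that have a domain when the master has none configured."""
    agent_ahead_of_master = agent_domain is not None and master_domain is None
    return not agent_ahead_of_master


print(master_accepts_registration(None, "us-west/a"))         # False: ignored
print(master_accepts_registration("us-west/a", "us-west/b"))  # True
print(master_accepts_registration("us-west/a", None))         # True
print(master_accepts_registration(None, None))                # True
```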
[jira] [Created] (MESOS-7611) Prevent master from joining mixed-region cluster
Neil Conway created MESOS-7611:
-------------------------------

Summary: Prevent master from joining mixed-region cluster
Key: MESOS-7611
URL: https://issues.apache.org/jira/browse/MESOS-7611
Project: Mesos
Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway

If a master with configured region X joins a cluster where the leading master has configured region Y, it should abort with an error message. This enforces the invariant that all the masters in the same Mesos cluster are configured to use the same region.
[jira] [Created] (MESOS-7610) Support domains in master and agent
Neil Conway created MESOS-7610:
-------------------------------

Summary: Support domains in master and agent
Key: MESOS-7610
URL: https://issues.apache.org/jira/browse/MESOS-7610
Project: Mesos
Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway
[jira] [Created] (MESOS-7609) Protobuf definitions for region-aware framework capability
Neil Conway created MESOS-7609:
-------------------------------

Summary: Protobuf definitions for region-aware framework capability
Key: MESOS-7609
URL: https://issues.apache.org/jira/browse/MESOS-7609
Project: Mesos
Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway
[jira] [Created] (MESOS-7608) Protobuf definitions for domains
Neil Conway created MESOS-7608:
-------------------------------

Summary: Protobuf definitions for domains
Key: MESOS-7608
URL: https://issues.apache.org/jira/browse/MESOS-7608
Project: Mesos
Issue Type: Improvement
Reporter: Neil Conway
Assignee: Neil Conway
[jira] [Created] (MESOS-7607) Support for first-class fault domains.
Neil Conway created MESOS-7607:
-------------------------------

Summary: Support for first-class fault domains.
Key: MESOS-7607
URL: https://issues.apache.org/jira/browse/MESOS-7607
Project: Mesos
Issue Type: Epic
Reporter: Neil Conway
Assignee: Neil Conway

Mesos should support a first-class notion of "fault domains", which effectively provide a common vocabulary for describing the region and zone where a node (either master or agent) is located.

Design doc: https://drive.google.com/open?id=1gEugdkLRbBsqsiFv3urRPRNrHwUC-i1HwfFfHR_MvC8
[jira] [Commented] (MESOS-6556) Hostname support for the network/cni isolator.
[ https://issues.apache.org/jira/browse/MESOS-6556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16034991#comment-16034991 ]

James DeFelice commented on MESOS-6556:
---------------------------------------

{{hostname}} is only applied when there are container networks present. When using host-mode networking, the UTS namespace is not isolated and {{hostname}} is not applied to the container.

Tracking via https://issues.apache.org/jira/browse/MESOS-7605

> Hostname support for the network/cni isolator.
> -----------------------------------------------
>
> Key: MESOS-6556
> URL: https://issues.apache.org/jira/browse/MESOS-6556
> Project: Mesos
> Issue Type: Improvement
> Components: containerization
> Reporter: James Peach
> Assignee: James Peach
> Priority: Minor
> Fix For: 1.2.0
>
> -Add a {{namespace/uts}} isolator for doing UTS namespace isolation without
> using the CNI isolator.-
> Update the {{network/cni}} isolator to set the hostname specified by the task
> info.
[jira] [Created] (MESOS-7606) Hierarchical allocator seems to perform redundant activation / deactivation of newly added frameworks.
Alexander Rukletsov created MESOS-7606:
---------------------------------------

Summary: Hierarchical allocator seems to perform redundant activation / deactivation of newly added frameworks.
Key: MESOS-7606
URL: https://issues.apache.org/jira/browse/MESOS-7606
Project: Mesos
Issue Type: Bug
Components: allocation
Affects Versions: 1.3.0
Reporter: Alexander Rukletsov

According to the logs,
{noformat}
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:32:58.226356 29728 hierarchical.cpp:274] Added framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:32:58.226405 29728 hierarchical.cpp:379] Deactivated framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:32:58.228570 29728 hierarchical.cpp:343] Activated framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
{noformat}
the built-in allocator, upon adding a framework, first deactivates it and then activates it again. This seems to be redundant: if a sorter client should always start deactivated, we should not call deactivate on it but rather add it in a way that leaves it deactivated from the start. This will naturally eliminate the logging issue.
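One way to picture the suggestion is a sorter whose `add` takes the initial state, so no separate deactivate call (and no "Deactivated" log line) is needed. The `Sorter` class below is a toy written for this illustration; it does not mirror the actual sorter interface in the Mesos allocator.

```python
class Sorter:
    """Toy sorter that tracks whether each client is active."""

    def __init__(self):
        self._active = {}
        self.log = []

    def add(self, client, active=False):
        # Clients start deactivated by default; nothing else needs to be called.
        self._active[client] = active
        self.log.append(f"Added {client}")

    def activate(self, client):
        if not self._active[client]:
            self._active[client] = True
            self.log.append(f"Activated {client}")

    def is_active(self, client):
        return self._active[client]


sorter = Sorter()
sorter.add("framework-1")        # added in the deactivated state
sorter.activate("framework-1")   # single state transition
print(sorter.log)                # ['Added framework-1', 'Activated framework-1']
```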
[jira] [Assigned] (MESOS-7601) Some container launch failures are mistakenly treated as errors.
[ https://issues.apache.org/jira/browse/MESOS-7601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alexander Rukletsov reassigned MESOS-7601:
-----------------------------------------
    Assignee: Alexander Rukletsov

https://reviews.apache.org/r/59746/

> Some container launch failures are mistakenly treated as errors.
> -----------------------------------------------------------------
>
> Key: MESOS-7601
> URL: https://issues.apache.org/jira/browse/MESOS-7601
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Affects Versions: 1.3.0
> Reporter: Alexander Rukletsov
> Assignee: Alexander Rukletsov
> Labels: containerizer, mesosphere
>
> I've observed a case where a scheduler stops (i.e. calls TEARDOWN) while some
> of its tasks are being launched. While this is a valid behaviour, the agent
> prints an error and increases the container launch error metrics.
> Below are log excerpts for such a framework,
> {{6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092}}.
> *Master log*
> {noformat}
> [centos@ip-172-31-6-200 ~]$ journalctl _PID=29716 --since "2 hours ago" --no-pager | grep "6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092"
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:32:58.226218 29724 master.cpp:6072] Updating info for framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:32:58.226356 29728 hierarchical.cpp:274] Added framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:32:58.226405 29728 hierarchical.cpp:379] Deactivated framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:32:58.228570 29728 hierarchical.cpp:343] Activated framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:32:58.246068 29721 master.cpp:7105] Sending 1 offers to framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 (TeraValidate) at scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:32:58.247851 29721 master.cpp:7194] Sending 1 inverse offers to framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 (TeraValidate) at scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:58 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:32:58.912937 29728 master.cpp:4806] Processing DECLINE call for offers: [ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509464 ] for framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 (TeraValidate) at scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:59 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:32:59.804184 29727 master.cpp:7105] Sending 2 offers to framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 (TeraValidate) at scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:32:59 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:32:59.804411 29727 master.cpp:7194] Sending 2 inverse offers to framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 (TeraValidate) at scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:33:01.248924 29721 master.cpp:7105] Sending 2 offers to framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 (TeraValidate) at scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:33:01.249289 29721 master.cpp:7194] Sending 2 inverse offers to framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driver-20170601113252-0092 (TeraValidate) at scheduler-3b84262b-e1a6-47a8-ac0f-00af50b24f5c@172.31.7.83:45531
> Jun 01 11:33:01 ip-172-31-6-200.us-west-2.compute.internal mesos-master[29716]: I0601 11:33:01.249724 29721 master.cpp:3851] Processing ACCEPT call for offers: [ 92434aef-27da-4fd1-a5c4-b286d640d5b3-O509469 ] on agent 36a25adb-4ea2-49d3-a195-448cff1dc146-S35 at slave(1)@172.31.13.122:5051 (172.31.13.122) for framework 6dd898d6-7f3a-406c-8ead-24b4d55ed262-0018-driv
[jira] [Created] (MESOS-7605) UCR doesn't isolate uts namespace w/ host networking
James DeFelice created MESOS-7605:
----------------------------------

Summary: UCR doesn't isolate uts namespace w/ host networking
Key: MESOS-7605
URL: https://issues.apache.org/jira/browse/MESOS-7605
Project: Mesos
Issue Type: Improvement
Components: containerization
Reporter: James DeFelice

Docker's {{run}} command supports a {{--hostname}} parameter which impacts container isolation, even in {{host}} network mode: (via https://docs.docker.com/engine/reference/run/)

{quote}
Even in host network mode a container has its own UTS namespace by default. As such --hostname is allowed in host network mode and will only change the hostname inside the container. Similar to --hostname, the --add-host, --dns, --dns-search, and --dns-option options can be used in host network mode.
{quote}

I see no evidence that UCR offers a similar isolation capability.

Related: the {{ContainerInfo}} protobuf has a {{hostname}} field which was initially added to support the Docker containerizer's use of the {{--hostname}} Docker {{run}} flag.
[jira] [Assigned] (MESOS-5995) Protobuf JSON deserialisation does not accept numbers formated as strings
[ https://issues.apache.org/jira/browse/MESOS-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Till Toenshoff reassigned MESOS-5995:
------------------------------------
    Assignee: (was: Tomasz Janiszewski)

Unassigned this issue due to discarded review and lack of new activity.

> Protobuf JSON deserialisation does not accept numbers formated as strings
> --------------------------------------------------------------------------
>
> Key: MESOS-5995
> URL: https://issues.apache.org/jira/browse/MESOS-5995
> Project: Mesos
> Issue Type: Bug
> Components: HTTP API
> Affects Versions: 1.0.0
> Reporter: Tomasz Janiszewski
> Priority: Critical
>
> Proto2 does not specify JSON mappings, but
> [Proto3|https://developers.google.com/protocol-buffers/docs/proto3#json] does,
> and it recommends mapping 64-bit numbers to strings. Unfortunately, Mesos does
> not accept strings in place of uint64 and returns a 400 Bad Request error:
> {quote}
> Failed to convert JSON into Call protobuf: Not expecting a JSON
> string for field 'value'.
> {quote}
> Is this on purpose or is this a bug?
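For illustration, a tolerant parser for a uint64 field could accept either a JSON number or a decimal string, as the proto3 JSON mapping allows. This is a standalone Python sketch with a hypothetical `parse_uint64` helper; it is not the Mesos (stout) JSON-to-protobuf conversion code.

```python
import json

UINT64_MAX = 2**64 - 1


def parse_uint64(value):
    """Accept a JSON integer or a decimal string and return it as an int."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python, so reject it explicitly.
        raise TypeError("booleans are not uint64 values")
    if isinstance(value, int):
        number = value
    elif isinstance(value, str) and value.isascii() and value.isdigit():
        number = int(value)
    else:
        raise ValueError(f"not a uint64: {value!r}")
    if not 0 <= number <= UINT64_MAX:
        raise ValueError(f"out of range for uint64: {number}")
    return number


body = json.loads('{"value": "18446744073709551615"}')
print(parse_uint64(body["value"]))  # 18446744073709551615
```

Sending large 64-bit values as strings matters because many JSON implementations store numbers as IEEE 754 doubles, which cannot represent every 64-bit integer exactly.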