[jira] [Commented] (MESOS-1565) Improve error message for external containerizer when containerizer_path results in command not found (status: 127)

Julien Eid (JIRA) Tue, 12 Aug 2014 09:35:24 -0700

    [ https://issues.apache.org/jira/browse/MESOS-1565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14094278#comment-14094278 ]


Julien Eid commented on MESOS-1565:
-----------------------------------

[~tnachen] I've run into this issue as well and it doesn't have a Docker label, 
if you're interested.

> Improve error message for external containerizer when containerizer_path 
> results in command not found (status: 127)
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-1565
>                 URL: https://issues.apache.org/jira/browse/MESOS-1565
>             Project: Mesos
>          Issue Type: Improvement
>          Components: containerization
>    Affects Versions: 0.19.0
>            Reporter: Ben Whitehead
>
> When running mesos-slave with an external containerizer and a bad
> containerizer_path, the error message is misleading about the real problem.
> It would be nice if the containerizer code could detect exit code 127 and
> report an error to the effect of "Command not found: <containerizer_path>".
> Below is a log file illustrating the scenario I ran into.
> {noformat}
> mesos-slave.sh --log_dir=/tmp/mesos/slave/log_dir 
> --master=zk://localhost:2181/mesos --work_dir=/tmp/mesos/slave/work_dir 
> --containerizer_path=/usr/local/bin/deimos --isolation=external
> I0707 17:09:00.525806 29499 logging.cpp:167] INFO level logging started!
> I0707 17:09:00.525997 29499 main.cpp:126] Build: 2014-06-12 18:09:59 by 
> ben.whitehead
> I0707 17:09:00.526013 29499 main.cpp:128] Version: 0.19.0
> I0707 17:09:00.526023 29499 main.cpp:131] Git tag: 0.19.0
> I0707 17:09:00.526033 29499 main.cpp:135] Git SHA: 
> 51e047524cf744ee257870eb479345646c0428ff
> I0707 17:09:00.526167 29499 main.cpp:149] Starting Mesos slave
> 2014-07-07 17:09:00,526:29499(0x7fc0a747f700):ZOO_INFO@log_env@712: Client 
> environment:zookeeper.version=zookeeper C client 3.4.5
> 2014-07-07 17:09:00,526:29499(0x7fc0a747f700):ZOO_INFO@log_env@716: Client 
> environment:host.name=xxxxxx
> 2014-07-07 17:09:00,526:29499(0x7fc0a747f700):ZOO_INFO@log_env@723: Client 
> environment:os.name=Linux
> 2014-07-07 17:09:00,526:29499(0x7fc0a747f700):ZOO_INFO@log_env@724: Client 
> environment:os.arch=3.11.10-17-desktop
> 2014-07-07 17:09:00,526:29499(0x7fc0a747f700):ZOO_INFO@log_env@725: Client 
> environment:os.version=#1 SMP PREEMPT Mon Jun 16 15:28:13 UTC 2014 (fba7c1f)
> 2014-07-07 17:09:00,526:29499(0x7fc0a747f700):ZOO_INFO@log_env@733: Client 
> environment:user.name=ben.whitehead
> 2014-07-07 17:09:00,526:29499(0x7fc0a747f700):ZOO_INFO@log_env@741: Client 
> environment:user.home=/home/ben.whitehead
> 2014-07-07 17:09:00,526:29499(0x7fc0a747f700):ZOO_INFO@log_env@753: Client 
> environment:user.dir=/home/ben.whitehead/tmp/mesos/mesos/build/bin
> 2014-07-07 17:09:00,526:29499(0x7fc0a747f700):ZOO_INFO@zookeeper_init@786: 
> Initiating client connection, host=localhost:2181 sessionTimeout=10000 
> watcher=0x7fc0ae7d59b0 sessionId=0 sessionPasswd=<null> 
> context=0x7fc09c0008e0 flags=0
> 2014-07-07 17:09:00,526:29499(0x7fc0a5c7c700):ZOO_INFO@check_events@1703: 
> initiated connection to server [127.0.0.1:2181]
> I0707 17:09:00.526564 29520 slave.cpp:143] Slave started on 1)@127.0.0.2:5051
> I0707 17:09:00.526713 29520 slave.cpp:255] Slave resources: cpus(*):8; 
> mem(*):14750; disk(*):221168; ports(*):[31000-32000]
> I0707 17:09:00.526747 29520 slave.cpp:283] Slave hostname: xxxxxx
> I0707 17:09:00.526757 29520 slave.cpp:284] Slave checkpoint: true
> I0707 17:09:00.527842 29518 state.cpp:33] Recovering state from 
> '/tmp/mesos/slave/work_dir/meta'
> I0707 17:09:00.528142 29516 status_update_manager.cpp:193] Recovering status 
> update manager
> I0707 17:09:00.528244 29517 external_containerizer.cpp:247] Recovering 
> containerizer
> 2014-07-07 17:09:00,544:29499(0x7fc0a5c7c700):ZOO_INFO@check_events@1750: 
> session establishment complete on server [127.0.0.1:2181], 
> sessionId=0x14712df6ba4000e, negotiated timeout=10000
> I0707 17:09:00.544852 29516 group.cpp:310] Group process ((4)@127.0.0.2:5051) 
> connected to ZooKeeper
> I0707 17:09:00.544888 29516 group.cpp:784] Syncing group operations: queue 
> size (joins, cancels, datas) = (0, 0, 0)
> I0707 17:09:00.544900 29516 group.cpp:382] Trying to create path '/mesos' in 
> ZooKeeper
> I0707 17:09:00.545446 29518 detector.cpp:135] Detected a new leader: (id='0')
> I0707 17:09:00.545524 29515 group.cpp:655] Trying to get 
> '/mesos/info_0000000000' in ZooKeeper
> I0707 17:09:00.545805 29517 detector.cpp:377] A new leading master 
> (UPID=master@127.0.0.2:5050) is detected
> Failed to perform recovery: Recover failed: External containerizer failed 
> (status: 127)
> To remedy this do as follows:
> Step 1: rm -f /tmp/mesos/slave/work_dir/meta/slaves/latest
>         This ensures slave doesn't recover old live executors.
> Step 2: Restart the slave.
> {noformat}
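
For illustration only, the check requested in the description could look roughly like the sketch below. It is not Mesos code: the helper names (describe_failure, run_containerizer) are hypothetical, and it assumes the containerizer is launched through /bin/sh, where a POSIX shell reports exit status 127 when it cannot find the command. A containerizer that itself exits with 127 would be indistinguishable from a missing binary, so the message is a best guess rather than a certainty.

```python
import shlex
import subprocess
from typing import Optional

# Exit status a POSIX shell returns when the command cannot be found.
COMMAND_NOT_FOUND = 127


def describe_failure(status: int, containerizer_path: str) -> str:
    """Map a non-zero exit status to an error message (hypothetical helper)."""
    if status == COMMAND_NOT_FOUND:
        return "Command not found: {}".format(containerizer_path)
    # Fall back to the generic wording seen in the log above.
    return "External containerizer failed (status: {})".format(status)


def run_containerizer(containerizer_path: str, *args: str) -> Optional[str]:
    """Run the containerizer via /bin/sh; return an error message, or None on success.

    Assumption: launching through a shell, so a missing binary surfaces as 127.
    """
    command = " ".join(shlex.quote(part) for part in (containerizer_path,) + args)
    result = subprocess.run(
        ["/bin/sh", "-c", command],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    if result.returncode == 0:
        return None
    return describe_failure(result.returncode, containerizer_path)
```

With a check like this, the --containerizer_path=/usr/local/bin/deimos case from the log would be reported as "Command not found: /usr/local/bin/deimos" whenever that binary is missing, rather than only "(status: 127)".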



--
This message was sent by Atlassian JIRA
(v6.2#6252)
