[ https://issues.apache.org/jira/browse/SINGA-201?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15346050#comment-15346050 ]
ASF subversion and git services commented on SINGA-201:
-------------------------------------------------------

Commit 1ca8c638b132009e213fda8e02e77cc2d09fb824 in incubator-singa's branch refs/heads/master from [~ug93tad]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-singa.git;h=1ca8c63 ]

SINGA-201 Error when running Mesos

A bug was reported (https://issues.apache.org/jira/browse/SINGA-201) when
launching SINGA on Mesos in fully distributed mode. The main cause was
determined to be ZeroMQ binding to localhost.

In fully distributed mode, SINGA on each node should be passed a `-host` flag
specifying the public IP address of the local host. The Mesos scheduler is
modified accordingly:

1. When a Mesos slave starts connecting to the master, it passes the
   `--hostname` flag specifying its public IP address.
2. The scheduler now sends each executor a command of the form:
   `singa -conf ./job.conf -singa_conf ./singa.conf -singa_job XX -host XX`

> Error while running singa on mesos in fully distributed mode
> ------------------------------------------------------------
>
> Key: SINGA-201
> URL: https://issues.apache.org/jira/browse/SINGA-201
> Project: Singa
> Issue Type: Bug
> Environment: Linux
> Reporter: Venkata Satish Katta
> Assignee: Anh Dinh
> Priority: Blocker
> Labels: mesos, singa
>
> Log file created at: 2016/06/17 10:00:43
> Running on machine: ip-172-31-52-12
> Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
> I0617 10:00:43.202184 2751 zk_service.cc:215] GLOBAL_WATCHER connected to
> zookeeper successfully!
> W0617 10:00:43.203711 2742 zk_service.cc:109] zookeeper node /singa already
> exists
> W0617 10:00:43.205016 2742 zk_service.cc:109] zookeeper node /singa/app
> already exists
> W0617 10:00:43.206166 2742 zk_service.cc:109] zookeeper node
> /singa/app/job-0000000017 already exists
> W0617 10:00:43.207147 2742 zk_service.cc:109] zookeeper node
> /singa/app/job-0000000017/group already exists
> W0617 10:00:43.208237 2742 zk_service.cc:109] zookeeper node
> /singa/app/job-0000000017/proc already exists
> W0617 10:00:43.209300 2742 zk_service.cc:109] zookeeper node
> /singa/app/job-0000000017/proc-lock already exists
> F0617 10:00:43.862246 2742 socket.cc:98] Check failed: port != -1 (-1 vs.
> -1) tcp://localhost:*

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
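
For illustration only, the executor command form quoted in the commit message can be sketched in Python. This is not the scheduler code from the commit: the helper name `build_executor_command`, the job id `17`, the address `203.0.113.10` (a documentation-range placeholder), and the rejection of loopback values are all assumptions made for this example. Only the flag names and file paths come from the commit message.

```python
import shlex


def build_executor_command(job_id: int, host: str) -> str:
    """Hypothetical helper: assemble the command form quoted in the commit message.

    The loopback check is an assumption added for illustration; it mirrors the
    reported failure, where ZeroMQ ended up binding to tcp://localhost:*.
    """
    if host in ("", "localhost", "127.0.0.1"):
        raise ValueError("-host should be the node's public IP address, not localhost")
    args = [
        "singa",
        "-conf", "./job.conf",
        "-singa_conf", "./singa.conf",
        "-singa_job", str(job_id),  # placeholder job id
        "-host", host,              # public IP address of this node
    ]
    # shlex.join (Python 3.8+) quotes arguments safely for a shell command line.
    return shlex.join(args)


print(build_executor_command(17, "203.0.113.10"))
```

Passing `localhost` raises `ValueError` in this sketch, which makes the misconfiguration visible before a process is launched rather than at socket bind time.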