Hi Guangya,

Thanks for the reply. I found an interesting log message:

 7410 master.cpp:5977] Removed slave
6a11063e-b8ff-43bd-86cf-e6eef0de06fd-S52 (192.168.0.178): a new slave
registered at the same address

It seems that because of this, the slave nodes keep getting registered
and de-registered, each one making room for the next. I can even see
this in the web UI: a node gets added, and after some time it is
replaced by a new slave node.

That log entry is followed by the messages below:


I1002 10:01:12.753865  7416 leveldb.cpp:343] Persisting action (18 bytes)
to leveldb took 104089ns
I1002 10:01:12.753885  7416 replica.cpp:679] Persisted action at 384
E1002 10:01:12.753891  7417 process.cpp:1912] Failed to shutdown socket
with fd 15: Transport endpoint is not connected
I1002 10:01:12.753988  7413 master.cpp:3930] Registered slave
6a11063e-b8ff-43bd-86cf-e6eef0de06fd-S62 at slave(1)@127.0.1.1:5051
(192.168.0.116) with cpus(*):8; mem(*):14930; disk(*):218578;
ports(*):[31000-32000]
I1002 10:01:12.754065  7413 master.cpp:1080] Slave
6a11063e-b8ff-43bd-86cf-e6eef0de06fd-S62 at slave(1)@127.0.1.1:5051
(192.168.0.116) disconnected
I1002 10:01:12.754072  7416 hierarchical.hpp:675] Added slave
6a11063e-b8ff-43bd-86cf-e6eef0de06fd-S62 (192.168.0.116) with cpus(*):8;
mem(*):14930; disk(*):218578; ports(*):[31000-32000] (allocated: )
I1002 10:01:12.754084  7413 master.cpp:2534] Disconnecting slave
6a11063e-b8ff-43bd-86cf-e6eef0de06fd-S62 at slave(1)@127.0.1.1:5051
(192.168.0.116)
E1002 10:01:12.754118  7417 process.cpp:1912] Failed to shutdown socket
with fd 16: Transport endpoint is not connected
I1002 10:01:12.754132  7413 master.cpp:2553] Deactivating slave
6a11063e-b8ff-43bd-86cf-e6eef0de06fd-S62 at slave(1)@127.0.1.1:5051
(192.168.0.116)
I1002 10:01:12.754237  7416 hierarchical.hpp:768] Slave
6a11063e-b8ff-43bd-86cf-e6eef0de06fd-S62 deactivated
I1002 10:01:12.754240  7413 replica.cpp:658] Replica received learned
notice for position 384
I1002 10:01:12.754360  7413 leveldb.cpp:343] Persisting action (20 bytes)
to leveldb took 95171ns
I1002 10:01:12.754395  7413 leveldb.cpp:401] Deleting ~2 keys from leveldb
took 20333ns
I1002 10:01:12.754406  7413 replica.cpp:679] Persisted action at 384
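
One more thing I noticed in the log above: the slave registers from
slave(1)@127.0.1.1:5051, i.e. the 127.0.1.1 loopback-style address rather
than the node's real IP, which might be why the master thinks "a new slave
registered at the same address". I am not sure yet, but as a rough sketch
of what I could try (the IPs below are just the ones from my network), I
would start each agent with its address pinned explicitly via --ip:

# sketch only: pin the advertised address on each slave node
# (the master/slave IPs are examples taken from my setup)
# on the 192.168.0.178 node:
mesos-slave --master=192.168.0.102:5050 --ip=192.168.0.178
# on the 192.168.0.116 node:
mesos-slave --master=192.168.0.102:5050 --ip=192.168.0.116

If each slave advertises its own routable address instead of 127.0.1.1, I
would expect the master to keep all of them registered at the same time.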


Thanks,
Pradeep

On 2 October 2015 at 02:35, Guangya Liu <gyliu...@gmail.com> wrote:

> Hi Pradeep,
>
> Please check some of my questions in line.
>
> Thanks,
>
> Guangya
>
> On Fri, Oct 2, 2015 at 12:55 AM, Pradeep Kiruvale <
> pradeepkiruv...@gmail.com> wrote:
>
>> Hi All,
>>
>> I am new to Mesos. I have set up a Mesos cluster with 1 Master and 3
>> Slaves.
>>
>> One slave runs on the Master Node itself and the other slaves run on
>> different nodes. Here, a node means a physical box.
>>
>> I tried running tasks by first configuring a one-node cluster. I tested
>> task scheduling using mesos-execute, and it works fine.
>>
>> When I configure a three-node cluster (1 master and 3 slaves) and try to
>> see the resources on the master (in the GUI), only the Master node's
>> resources are visible. The other nodes' resources are not visible;
>> sometimes they are visible but in a deactivated state.
>>
> Can you please append some logs from mesos-slave and mesos-master? There
> should be some logs in either master or slave telling you what is wrong.
>
>>
>> *Please let me know what could be the reason. All the nodes are in the
>> same network.*
>>
>> When I try to schedule a task using
>>
>> /src/mesos-execute --master=192.168.0.102:5050 --name="cluster-test"
>> --command="/usr/bin/hackbench -s 4096 -l 10845760 -g 2 -f 2 -P"
>> --resources="cpus(*):3;mem(*):2560"
>>
>> The tasks always get scheduled on the same node. The resources from the
>> other nodes are not getting used to schedule the tasks.
>>
> Based on your previous question, there is only one node in your cluster;
> that's why the other nodes are not available. We first need to identify
> what is wrong with the other three nodes.
>
>>
>> *Is it required to register the frameworks from every slave node on the
>> Master?*
>>
> It is not required.
>
>>
>> *I have configured this cluster using the git-hub code.*
>>
>>
>> Thanks & Regards,
>> Pradeep
>>
>>
>
