Thanks, Mike. You were right.

I killed the stale process and am now able to run slurmctld.
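
For anyone finding this thread later: the check Mike describes below can be scripted roughly as follows. This is only a sketch. listener_on_port is a made-up helper name (not part of Slurm or net-tools), it assumes the Linux net-tools column layout shown in Mike's netstat output, and 6817 is the default port he mentions; adjust it if your slurm.conf sets a different one.

```shell
# Hypothetical helper: read `netstat -ltpn` output on stdin and print the
# "PID/program" column for any TCP listener bound to the given port.
listener_on_port() {
    port="$1"
    awk -v p=":$port" '
        # $4 is the local address (e.g. 0.0.0.0:6817), $6 the state,
        # $7 the PID/program pair. Match only when $4 ends in ":<port>".
        $1 ~ /^tcp/ && $6 == "LISTEN" &&
        substr($4, length($4) - length(p) + 1) == p { print $7 }
    '
}

# Usage (illustrative; the PID is the one from the example output below):
#   netstat -ltpn | listener_on_port 6817    # e.g. 11143/slurmctld
#   kill 11143                               # only if it really is stale
#   netstat -ltpn | listener_on_port 6817    # should now print nothing
```

Run netstat as root so the PID/program column is populated. If the listener turns out to be a leftover slurmctld, kill that PID and start the daemon again.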

Adam

On Mon, Jun 15, 2015 at 11:51 AM, Michael Robbert <mrobb...@mines.edu>
wrote:

> Adam,
> That error looks like you already have a slurmctld running on this host
> (or possibly some other program that is listening on the same TCP port).
>
> By default slurmctld binds to TCP/6817, and I don’t see a different port
> specified in your config file. That is probably fine; don’t change it
> unless you need to. Try running netstat to see what is currently
> listening on that port:
>
> # netstat -ltpn|grep 6817
> tcp        0      0 0.0.0.0:6817                0.0.0.0:*                   LISTEN      11143/slurmctld
>
> It is likely a stale slurmctld process. If so, just kill it and try to
> start again.
>
> Mike
>
> On Jun 15, 2015, at 9:02 AM, Cooper, Adam <adam_coo...@brown.edu> wrote:
>
>  Hi,
> I am new to SLURM and have been tasked with installing it on a cluster of
> 15 servers. So far I have installed SLURM only on the master, and I hope
> to get the daemons running and scheduling jobs there before I try to get
> it working for the whole cluster. All of the machines are running Ubuntu
> 12.04. I have worked through some errors already; however, currently when
> I run:
>
> sudo slurmctld -Dv
>
> I get this output:
>
> slurmctld: pidfile not locked, assuming no running daemon
> slurmctld: slurmctld version 14.11.7 started on cluster cluster
> slurmctld: OpenSSL cryptographic signature plugin loaded
> slurmctld: preempt/none loaded
> slurmctld: ExtSensors NONE plugin loaded
> slurmctld: Accounting storage NOT INVOKED plugin loaded
> slurmctld: layouts: no layout to initialize
> slurmctld: topology NONE plugin loaded
> slurmctld: sched: Backfill scheduler plugin loaded
> slurmctld: route default plugin loaded
> slurmctld: layouts: loading entities/relations information
> slurmctld: Recovered state of 1 nodes
> slurmctld: Recovered information about 0 jobs
> slurmctld: Recovered state of 0 reservations
> slurmctld: State of 0 triggers recovered
> slurmctld: read_slurm_conf: backup_controller not specified.
> slurmctld: Running as primary controller
> slurmctld: error: Error binding slurm stream socket: Address already in use
> slurmctld: fatal: slurm_init_msg_engine_addrname_port error Address already in use
>
>
> By the way, I am running the daemon as root because my boss does not
> want me to create a separate 'slurm' user. Any idea what might cause this
> fatal error? I've attached an RTF of the current slurm configuration file
> (I've REDACTED some things to keep them private), which I made using the
> online configuration tool.
>
> Please let me know if you need any other relevant information. Thank you
> in advance, and sorry for my lack of knowledge; this is very new work for
> me.
>
>
> Adam Cooper
>
> Brown University Computer Engineering '16
>
>
>
>  <slurm_conf_current.rtf>
>
>
>
