Hi Prasanna,

1. To run SINGA on a cluster, you need to set the zookeeper location in the "conf/singa.conf" file. Just set the "zookeeper_host" field to the address of the zookeeper service you are using.
2. To run on a GPU, please make sure that the job configuration file on every GPU node has the following field:

    gpu: <gpu id>

If you need to use multiple GPUs in a single node, please add all of them to the configuration file, e.g.:

    gpu: 0
    gpu: 1
    ...

When a worker is running on a GPU, you will see the following line in the log files:

    Worker (group = XXX, id = XXX) start on GPU XXX

Regards,
Sheng

On Mon, Jul 11, 2016 at 4:04 AM, Prasanna Balaprakash <[email protected]> wrote:

> Dear developers,
>
> I am trying to run SINGA in a cluster environment with ~100 hybrid
> (CPU+GPU) nodes.
>
> I started with a single-node experiment.
>
> As per the instructions, in my COBALT job script, I use
> "cat $COBALT_NODEFILE > conf/hostfile", where $COBALT_NODEFILE in COBALT
> gives the list of allocated nodes.
>
> I am not sure how to set the zookeeper location!
>
> Also, how can I verify whether the GPU is used?
>
> E0710 18:39:23.837704 72213 cluster.cc:50] proc #0 -> localhost:0 (pid = 72213)
> E0710 18:39:23.898723 72241 server.cc:64] Server (group = 0, id = 0) start
> E0710 18:39:24.898967 72242 worker.cc:79] Worker (group = 0, id = 0) start on CPU
>
> From this log file it seems only the CPU is used.
>
> Thanks
> Prasanna
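
For checking many nodes at once, the "Worker ... start on ..." lines from the logs can be scanned with a short script. The following is only a minimal Python sketch and is not part of SINGA: the regular expression is inferred from the two log formats that appear in this thread ("start on CPU" and "start on GPU XXX"), and the function name worker_devices is hypothetical.

```python
import re

# Pattern inferred from the log lines quoted in this thread (an assumption,
# not an official SINGA log specification).
WORKER_START = re.compile(
    r"Worker \(group = (\d+), id = (\d+)\) start on (CPU|GPU)(?: (\d+))?"
)


def worker_devices(log_lines):
    """Return (group, worker_id, device) for every worker start line found."""
    found = []
    for line in log_lines:
        match = WORKER_START.search(line)
        if match:
            group, worker_id, kind, gpu_id = match.groups()
            # CPU lines carry no device id; GPU lines end with the GPU id.
            device = kind if gpu_id is None else f"{kind} {gpu_id}"
            found.append((int(group), int(worker_id), device))
    return found


# Sample lines copied from the log in the original message.
sample = [
    "E0710 18:39:23.898723 72241 server.cc:64] Server (group = 0, id = 0) start",
    "E0710 18:39:24.898967 72242 worker.cc:79] Worker (group = 0, id = 0) start on CPU",
]
print(worker_devices(sample))  # prints [(0, 0, 'CPU')]
```

If every tuple reports "CPU", as in the log above, the gpu field is most likely missing from the job configuration file on that node.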
