On my cluster running Slurm 2.6.2, I'm having a problem running xstata (the graphical version of Stata).
If I launch xstata directly on the master or on any node as a normal user, everything is fine. If I launch xstata with srun (just "srun xstata"), nothing happens (no output, nothing special in the Slurm log) and the command terminates almost immediately. I am able to launch other graphical applications through srun.

I have also tried launching xstata with --slurmd-debug:

srun --slurmd-debug=4 xstata
slurmd[node01]: debug level = 6
slurmd[node01]: Uncached user/gid: sagon/1000
slurmd[node01]: IO handler started pid=105416
slurmd[node01]: task 0 (105421) started 2013-10-16T15:44:54
slurmd[node01]: Setting slurmstepd oom_adj to -1000
slurmd[node01]: adding task 0 pid 105421 on node 0 to jobacct
slurmd[node01]: 105421 mem size 1008 200024 time 0(0+0)
slurmd[node01]: _get_sys_interface_freq_line: filename = /sys/devices/system/cpu/cpu0/cpufreq/cpuinfo_cur_freq
slurmd[node01]: cpu 0 freq= 2201000
slurmd[node01]: Task average frequency = 2201000 pid 105421 mem size 1008 200024 time 0(0+0)
slurmd[node01]: energycounted = 0
slurmd[node01]: getjoules_task energy = 0
slurmd[node01]: Sending launch resp rc=0
slurmd[node01]: auth plugin for Munge (http://code.google.com/p/munge/) loaded
slurmd[node01]: Handling REQUEST_INFO
slurmd[node01]: Handling REQUEST_SIGNAL_CONTAINER
slurmd[node01]: _handle_signal_container for step=48997.0 uid=0 signal=995
slurmd[node01]: Uncached user/gid: sagon/1000
slurmd[node01]: mpi type = (null)
slurmd[node01]: Using mpi/openmpi
slurmd[node01]: _set_limit: conf setrlimit RLIMIT_CPU no change in value: 18446744073709551615
slurmd[node01]: _set_limit: conf setrlimit RLIMIT_FSIZE no change in value: 18446744073709551615
slurmd[node01]: _set_limit: conf setrlimit RLIMIT_DATA no change in value: 18446744073709551615
slurmd[node01]: _set_limit: conf setrlimit RLIMIT_STACK no change in value: 18446744073709551615
slurmd[node01]: _set_limit: conf setrlimit RLIMIT_CORE no change in value: 0
slurmd[node01]: _set_limit: conf setrlimit RLIMIT_RSS no change in value: 18446744073709551615
slurmd[node01]: _set_limit: conf setrlimit RLIMIT_NPROC no change in value: 18446744073709551615
slurmd[node01]: _set_limit: RLIMIT_NOFILE : max:8192 cur:8192 req:1024
slurmd[node01]: _set_limit: conf setrlimit RLIMIT_NOFILE succeeded
slurmd[node01]: _set_limit: conf setrlimit RLIMIT_MEMLOCK no change in value: 18446744073709551615
slurmd[node01]: _set_limit: conf setrlimit RLIMIT_AS no change in value: 18446744073709551615
slurmd[node01]: removing task 0 pid 105421 from jobacct
slurmd[node01]: task 0 (105421) exited with exit code 0.
slurmd[node01]: Aggregated 1 task exit messages
slurmd[node01]: killing process 105424 (inherited_task) with signal 9
slurmd[node01]: killing process 105424 (inherited_task) with signal 9
slurmd[node01]: Sending SIGKILL to pgid 105416
slurmd[node01]: Waiting for IO
slurmd[node01]: Closing debug channel

Thanks for your ideas!
