Dear Daniel and Rayson,
Thanks for your help! I created the JSV script, which sets up the CPU
core binding.
Daniel!
I replaced the sge_execd, sge_shepherd and loadsensor files on the
UltraViolet with Open Grid Scheduler binaries. I compiled the "Grid
Engine 2011.11" (http://gridscheduler.sourceforge.net/) release, which
contains the hwloc feature. This is why the binding and topology
reporting are working. So for the SGE master I am using SGE 6.2u5 and
for the execd Grid Engine 2011.11. They seem to be compatible. I read
about this possibility on this website: http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html
Cheers,
Gabor
On 2012.01.12., at 7:44, Rayson Ho wrote:
The user guide has a good description of core binding:
http://docs.oracle.com/cd/E24901_01/doc.62/e21976/chapter2.htm#BGBFIDHB
The first step is to make sure that core binding does improve
performance, and we will then work on enabling it in a JSV.
Rayson
On 2012.01.12., at 13:14, Daniel Gruber wrote:
While core binding itself should work with such a topology (I never
tried it) in 6.2u5, the reporting of the topology string will be wrong.
As you might have noticed, string-based load values are only reported
up to a length of 1024 bytes, which means that with 1000 nodes the full
topology string does not arrive. Hence m_topology and
m_topology_inuse, as well as the selected cores reported by
"qstat -cb", are meaningless.
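To see how much of the string actually arrived, a small shell helper along these lines can count what is in it. This is only a sketch: topology_summary is a made-up name, and it assumes the usual m_topology letters (S for a socket, C for a core). With six-core sockets each socket takes 7 characters, so a 1024-byte value has room for roughly 146 sockets.

```shell
#!/bin/sh
# Hypothetical helper: summarise an m_topology string.
# Assumes 'S' marks a socket and 'C' marks a core.
topology_summary()
{
   topo=$1
   # Delete everything except the letter of interest, then count bytes.
   sockets=$(printf '%s' "$topo" | tr -cd 'S' | wc -c)
   cores=$(printf '%s' "$topo" | tr -cd 'C' | wc -c)
   # $(( )) strips the padding some wc implementations add.
   echo "length=${#topo} sockets=$((sockets)) cores=$((cores))"
}
```

Pass it the value that follows "hl:m_topology=" in the host's load report. A length at or very near 1024 suggests the string was cut off.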
Unfortunately there is no free example script I'm aware of. SGE 6.2u6
contained one, but it is not free.
You should consider setting the following JSV parameters:
binding_strategy, binding_type, binding_amount, binding_step,
binding_socket, binding_core, binding_exp_n,
binding_exp_socket<id>, binding_exp_core<id>
with jsv_set_param, for example (for general JSV usage there are
some examples in 6.2u5).
Often it is required to set the length of the explicit request to 0
(jsv_set_param binding_exp_n 0),
but doing it twice is not a good idea in SGE 6.2u5.
binding_strategy can be set to one of: linear, linear_automatic,
striding, striding_automatic, explicit, or no_job_binding.
Usually you want to set "linear_automatic" or "striding_automatic",
which only requires setting "binding_amount" (and "binding_step" for
striding).
This corresponds to "qsub -binding linear:N" and "qsub -binding
striding:S:N".
Using "linear" here requires setting a start core and a start socket,
which on the command line is something like "qsub -binding linear:2:0,0".
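Put together, a server-side JSV along these lines might look like the sketch below. It is untested on 6.2u5 and rests on a few assumptions: that the stock helper library lives at $SGE_ROOT/util/resources/jsv/jsv_include.sh, that binding_type accepts "set", that a job submitted without -binding either lacks binding_strategy or carries no_job_binding, and that one core per slot (pe_min) is the right amount for parallel jobs.

```shell
#!/bin/sh
# Sketch: request linear_automatic core binding for jobs without -binding.

jsv_on_start()
{
   return
}

jsv_on_verify()
{
   # Respect an explicit request: only touch jobs without a binding.
   # Assumption: such jobs lack binding_strategy or have no_job_binding.
   if [ "$(jsv_is_param binding_strategy)" = "true" ] &&
      [ "$(jsv_get_param binding_strategy)" != "no_job_binding" ]; then
      jsv_accept "Job already requests a core binding"
      return
   fi

   # Assumption: one core per slot. Parallel jobs use the lower bound
   # of their slot request; serial jobs and array tasks get one core.
   amount=1
   if [ "$(jsv_is_param pe_name)" = "true" ]; then
      amount=$(jsv_get_param pe_min)
   fi

   jsv_set_param binding_strategy linear_automatic
   jsv_set_param binding_type set        # assumed value, see note above
   jsv_set_param binding_amount "$amount"
   jsv_set_param binding_exp_n 0         # set once only, as noted above

   jsv_correct "Core binding linear:$amount added"
   return
}

# Load the helper functions shipped with Grid Engine and start the
# JSV protocol loop. The path is the assumed default location.
JSV_LIB="${SGE_ROOT:-}/util/resources/jsv/jsv_include.sh"
if [ -r "$JSV_LIB" ]; then
   . "$JSV_LIB"
   jsv_main
else
   echo "cannot read $JSV_LIB" >&2
fi
```

The effect should be the same as if the user had typed "qsub -binding linear:N" with N equal to the slot count, while jobs that already carry a -binding request pass through unchanged.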
Does a "qsub -binding linear:1" actually work with 6.2u5 on the
UltraViolet (meaning that it binds on the execution host without any
issues)?
Regards,
Daniel
On 12.01.2012, at 07:25, Gabor Roczei wrote:
Dear Reuti,
On 2012.01.11., at 22:19, Reuti wrote:
Hi,
On 11.01.2012, at 22:02, Gabor Roczei wrote:
I am looking for a JSV script which can assign CPU core binding
to parallel and array jobs automatically. Is there such a
script somewhere on the web which can be used for SGE 6.2u5? I
would be very glad if someone could share it with me.
What are you looking for in detail? Is the built-in
`qsub -binding ...` not sufficient, or do you just want to automate this?
Our users do not use the "-binding" parameter when they submit a job,
and I would like to set it with a JSV script. We have an SGI
UltraViolet 1000 SMP (ccNUMA) supercomputer which has this CPU
topology:
hl:m_topology=SCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCC
CSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCC
CCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCC
CCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCC

The process context-switch rate is too high, and we would like to use
this automatic binding feature to reduce it and speed up the jobs.
Gabor
_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users