Dear Daniel and Rayson,

Thanks for your help! I created a JSV script which sets up the CPU core binding.

Daniel!

I replaced the sge_execd, sge_shepherd and loadsensor files on the UltraViolet with Open Grid Scheduler binaries. I compiled the "Grid Engine 2011.11" release (http://gridscheduler.sourceforge.net/), which contains the hwloc feature. This is why the binding and topology reporting are working. So I am using SGE 6.2u5 for the SGE master and Grid Engine 2011.11 for the execd. They seem to be compatible. I read about this possibility on this website: http://gridscheduler.sourceforge.net/projects/hwloc/GridEnginehwloc.html

Cheers,

   Gabor

On 2012.01.12., at 7:44, Rayson Ho wrote:

The user guide has a good description of core binding:

http://docs.oracle.com/cd/E24901_01/doc.62/e21976/chapter2.htm#BGBFIDHB

The first step is to make sure that core binding does improve
performance, and we will then work on enabling it in a JSV.

Rayson


On 2012.01.12., at 13:14, Daniel Gruber wrote:

While core binding itself should work with such a topology in 6.2u5 (I never tried it), the reporting of the topology string will be wrong. As you might have noticed, string-based load values are only reported up to a length of 1024 bytes,
which means that with 1000 nodes the full topology string does not arrive.
Hence the m_topology and m_topology_inuse as well as the selected
cores reported from "qstat -cb" are meaningless.
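
To put a number on the 1024-byte limit: assuming each socket contributes one "S" and one "C" per core to the topology string (no thread markers), its length is easy to estimate. The following is only a sketch. The function name and the 256-socket host are made up for illustration, and the limit is taken at face value from the paragraph above.

```shell
#!/bin/sh
# Hypothetical helper: length of a topology string that consists of
# one "S" per socket plus one "C" per core (no thread markers assumed).
topology_length() {
    sockets=$1
    cores_per_socket=$2
    echo $(( sockets * (cores_per_socket + 1) ))
}

LIMIT=1024   # limit for string load values, as stated above

# Illustrative host: 256 sockets with 6 cores each (assumed numbers).
len=$(topology_length 256 6)
if [ "$len" -gt "$LIMIT" ]; then
    echo "topology string needs $len bytes, $((len - LIMIT)) more than the limit"
else
    echo "topology string fits ($len bytes)"
fi
```

Under these assumptions a host with six cores per socket stays within the limit only up to 146 sockets (146 x 7 = 1022 bytes).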

Unfortunately there is no free example script I'm aware of. SGE 6.2u6 contained
one but this is not free.

You should consider setting the following JSV parameters:

binding_strategy, binding_type, binding_amount, binding_step, binding_socket, binding_core, binding_exp_n,
binding_exp_socket<id>, binding_exp_core<id>

with jsv_set_param for example (for the general JSV usage there are some examples in 6.2u5).

Often it is required to set the length of the explicit request to 0 (jsv_set_param binding_exp_n 0),
but doing it twice is not a good idea in SGE 6.2u5.

binding_strategy can be set to either: linear, linear_automatic, striding, striding_automatic, explicit, or no_job_binding.

Usually you want to set "linear_automatic" or "striding_automatic", which only requires setting "binding_amount" (and, for striding, "binding_step"). This corresponds to "qsub -binding linear:<amount>" and "qsub -binding striding:<amount>:<step>".

Using "linear" here requires setting a start socket and a start core, which on the command line is something like "qsub -binding linear:2:0,0".
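
As an illustration of the parameters above, a minimal shell JSV might look like the sketch below. It is not a tested production script. The library path is the usual location of the shell JSV helpers in a 6.2u5 installation, "set" as the binding_type value and the default amount of 1 are assumptions, and the guard around jsv_main is only there so the file can be checked on a machine without Grid Engine. Array and parallel jobs would need extra logic to pick a sensible amount.

```shell
#!/bin/sh
# Sketch: add the equivalent of "-binding linear:1" when the user did not
# request any binding. Assumed library location (standard SGE layout):
jsv_lib="${SGE_ROOT:-}/util/resources/jsv/jsv_include.sh"
[ -r "$jsv_lib" ] && . "$jsv_lib"

DEFAULT_AMOUNT=1   # illustrative default, not derived from the job request

jsv_on_start() {
    return
}

jsv_on_verify() {
    if [ "$(jsv_is_param binding_strategy)" = "true" ]; then
        # The user already passed -binding; leave the job untouched.
        jsv_accept "binding already requested"
        return
    fi
    jsv_set_param binding_strategy linear_automatic
    jsv_set_param binding_type set
    jsv_set_param binding_amount "$DEFAULT_AMOUNT"
    jsv_set_param binding_exp_n 0      # set only once, see the note above
    jsv_correct "added -binding linear:$DEFAULT_AMOUNT"
    return
}

# Enter the JSV protocol loop only if the library was actually loaded.
if command -v jsv_main >/dev/null 2>&1; then
    jsv_main
fi
```

Such a script can be tried on a single submission with "qsub -jsv <path to script>" before it is configured for the whole cluster.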

Does a "qsub -binding linear:1" actually work with 6.2u5 on the UltraViolet (meaning that it binds on the execution host without any issues)?

Regards,

Daniel


On 12.01.2012, at 07:25, Gabor Roczei wrote:

Dear Reuti,

On 2012.01.11., at 22:19, Reuti wrote:

Hi,

On 11.01.2012, at 22:02, Gabor Roczei wrote:

I am looking for a JSV script which can automatically assign CPU core binding to parallel and array jobs. Is there such a script somewhere on the web which can be used for SGE 6.2u5? I would be very glad if someone could share one with me.

what are you looking for in detail? Is the built-in `qsub -binding ...` not sufficient, or do you just want to automate this?

Our users do not use the "-binding" parameter when they submit a job, and I would like to set it with a JSV script. We have an SGI UltraViolet 1000 SMP (ccNUMA) supercomputer which has this CPU topology:

hl:m_topology=SCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCC CSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCC CCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCC CCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCC CCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCCSCCCCCC
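
A string of this size is hard to check by eye. A small shell sketch like the following could count the sockets and cores in it. The helper name and the short sample value are invented for illustration (the sample is not the real UV 1000 string), and any whitespace introduced by line wrapping is ignored.

```shell
#!/bin/sh
# Hypothetical helper: count how often one character occurs in a string.
# Everything except that character (including whitespace) is dropped first.
count_char() {
    printf '%s' "$1" | tr -cd "$2" | wc -c | tr -d ' '
}

# Short made-up sample with a stray space, standing in for the real value.
topology="SCCCCCCSCCCCCC SCCCCCC"

sockets=$(count_char "$topology" S)
cores=$(count_char "$topology" C)
echo "sockets=$sockets cores=$cores"
```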

The rate of process context switches is too high, and we would like to use this automatic binding feature to decrease it and speed up the jobs.

Gabor


_______________________________________________
users mailing list
[email protected]
https://gridengine.org/mailman/listinfo/users

