Hi again, I think the problem is solved. Thanks to Gus, I tried running the program with mpirun -mca mpi_paffinity_alone 1. From a quick search, I understand this binds each process to a specific core (correct me if I'm wrong). I've run over 20 tests, and now it works fine.
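In case it's useful to anyone who hits the same thing, here is a rough sketch of what the run looks like now. The hostfile and program names are just placeholders, and the 3-slots-per-node layout matches the 6-node, 3-cores-per-node setup I describe below; adjust to your own cluster:

  # my_hostfile: 6 nodes, 3 slots each, 18 ranks in total
  os221 slots=3
  os222 slots=3
  os223 slots=3
  os224 slots=3
  os228 slots=3
  os229 slots=3

  # mpi_paffinity_alone=1 asks Open MPI to bind each process to its own core
  mpirun -np 18 -hostfile ./my_hostfile -mca mpi_paffinity_alone 1 ./my_program

To double-check the binding on a node you can run "taskset -cp <pid>" for each MPI process; it prints the list of CPUs the process is allowed to run on.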
Thanks a lot,
Saygin

On Thu, Aug 12, 2010 at 11:39 AM, Saygin Arkan <saygen...@gmail.com> wrote:
> Hi Gus,
>
> 1 - First of all, turning off hyper-threading is not an option. And it
> gives pretty good results if I can find a way to arrange the cores.
>
> 2 - Actually Eugene (in one of her messages in this thread) had suggested
> arranging the slots. I did and posted the results; the cores are still
> assigned randomly, nothing changed. But I haven't checked the -loadbalance
> option. -byslot or -bynode is not going to help.
>
> 3 - Could you give me a bit more detail on how affinity works, or what it
> actually does?
>
> Thanks a lot for your suggestions
>
> Saygin
>
>
> On Wed, Aug 11, 2010 at 6:18 PM, Gus Correa <g...@ldeo.columbia.edu> wrote:
>
>> Hi Saygin
>>
>> You could:
>>
>> 1) turn off hyperthreading (in the BIOS), or
>>
>> 2) use the mpirun options (you didn't send your mpirun command)
>> to distribute the processes across the nodes, cores, etc.
>> "man mpirun" is a good resource; see the explanations of
>> the -byslot, -bynode, and -loadbalance options.
>>
>> 3) In addition, you can use an MCA parameter to set processor affinity
>> on the mpirun command line: "mpirun -mca mpi_paffinity_alone 1 ..."
>> I don't know how this will play on a hyperthreaded machine,
>> but it works fine on our dual-processor quad-core computers
>> (not hyperthreaded).
>>
>> Depending on your code, hyperthreading may not help performance anyway.
>>
>> I hope this helps,
>> Gus Correa
>>
>> Saygin Arkan wrote:
>>
>>> Hello,
>>>
>>> I'm running MPI jobs on a non-homogeneous cluster. 4 of my machines
>>> (os221, os222, os223, os224) have the following properties:
>>>
>>> vendor_id       : GenuineIntel
>>> cpu family      : 6
>>> model           : 23
>>> model name      : Intel(R) Core(TM)2 Quad CPU Q9300 @ 2.50GHz
>>> stepping        : 7
>>> cache size      : 3072 KB
>>> physical id     : 0
>>> siblings        : 4
>>> core id         : 3
>>> cpu cores       : 4
>>> fpu             : yes
>>> fpu_exception   : yes
>>> cpuid level     : 10
>>> wp              : yes
>>> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
>>> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
>>> nx lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl vmx smx
>>> est tm2 ssse3 cx16 xtpr sse4_1 lahf_lm
>>> bogomips        : 4999.40
>>> clflush size    : 64
>>> cache_alignment : 64
>>> address sizes   : 36 bits physical, 48 bits virtual
>>>
>>> and the problematic, hyper-threaded 2 machines (os228 and os229) are as
>>> follows:
>>>
>>> vendor_id       : GenuineIntel
>>> cpu family      : 6
>>> model           : 26
>>> model name      : Intel(R) Core(TM) i7 CPU 920 @ 2.67GHz
>>> stepping        : 5
>>> cache size      : 8192 KB
>>> physical id     : 0
>>> siblings        : 8
>>> core id         : 3
>>> cpu cores       : 4
>>> fpu             : yes
>>> fpu_exception   : yes
>>> cpuid level     : 11
>>> wp              : yes
>>> flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
>>> mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall
>>> nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good pni monitor ds_cpl
>>> vmx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm ida
>>> bogomips        : 5396.88
>>> clflush size    : 64
>>> cache_alignment : 64
>>> address sizes   : 36 bits physical, 48 bits virtual
>>>
>>>
>>> The problem is that those 2 machines appear to have 8 cores (virtually;
>>> the actual core count is 4).
>>> When I submit an MPI job, I measure the comparison times across the
>>> cluster, and I got strange results.
>>>
>>> I'm running the job on 6 nodes, 3 cores per node, and sometimes (I'd say
>>> in 1/3 of the tests) os228 or os229 returns strange results.
>>> 2 of its cores are slow (slower than the first 4 nodes) but the 3rd core
>>> is extremely fast.
>>>
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - RANK(0) Printing Times...
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os221 RANK(1)  :38 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os222 RANK(2)  :38 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os224 RANK(3)  :38 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os228 RANK(4)  :37 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os229 RANK(5)  :34 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os223 RANK(6)  :38 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os221 RANK(7)  :39 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os222 RANK(8)  :37 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os224 RANK(9)  :38 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os228 RANK(10) :*48 sec*
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os229 RANK(11) :35 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os223 RANK(12) :38 sec
>>> 2010-08-05 14:30:58,926 50672 DEBUG [0x7fcadf98c740] - os221 RANK(13) :37 sec
>>> 2010-08-05 14:30:58,926 50673 DEBUG [0x7fcadf98c740] - os222 RANK(14) :37 sec
>>> 2010-08-05 14:30:58,926 50673 DEBUG [0x7fcadf98c740] - os224 RANK(15) :38 sec
>>> 2010-08-05 14:30:58,926 50673 DEBUG [0x7fcadf98c740] - os228 RANK(16) :*43 sec*
>>> 2010-08-05 14:30:58,926 50673 DEBUG [0x7fcadf98c740] - os229 RANK(17) :35 sec
>>> TOTAL CORRELATION TIME: 48 sec
>>>
>>>
>>> or another test:
>>>
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - RANK(0) Printing Times...
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os221 RANK(1)  :170 sec
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os222 RANK(2)  :161 sec
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os224 RANK(3)  :158 sec
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os228 RANK(4)  :142 sec
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os229 RANK(5)  :*256 sec*
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os223 RANK(6)  :156 sec
>>> 2010-08-09 15:28:10,947 272904 DEBUG [0x7f27dec27740] - os221 RANK(7)  :162 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os222 RANK(8)  :159 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os224 RANK(9)  :168 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os228 RANK(10) :141 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os229 RANK(11) :136 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os223 RANK(12) :173 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os221 RANK(13) :164 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os222 RANK(14) :171 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os224 RANK(15) :156 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os228 RANK(16) :136 sec
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - os229 RANK(17) :*250 sec*
>>> 2010-08-09 15:28:10,947 272905 DEBUG [0x7f27dec27740] - TOTAL CORRELATION TIME: 256 sec
>>>
>>>
>>> Do you have any idea why this is happening?
>>> I assume that it gives 2 jobs to 2 cores on os229, but those 2 are
>>> actually the same physical core.
>>> Do you have any idea how I can fix it? The slowest rank dominates the
>>> overall time: a 100 sec delay is too much for a 250 sec comparison, which
>>> could otherwise have finished in around 160 sec.
>>>
>>>
>>> --
>>> Saygin
>>>
>>>
>>> _______________________________________________
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>> _______________________________________________
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> --
> Saygin
>

--
Saygin