Thomas,
If you are positive that the two sets of clients are not reading files on each 
other's OSTs, I don't think there is anything at the Lustre level that 
communicates between OSSes to balance traffic or anything like that.

One possibility is congestion control at the network level, possibly at the 
switch?

Cheers, Andreas

On Jan 23, 2020, at 08:01, Thomas Roth <t.r...@gsi.de> wrote:

Hi all,

Lustre 2.10.6, 45 OSS with 7 OSTs each on ZFS 0.7.9, 3 MDTs (ldiskfs), clients 
2.10 and 2.12. InfiniBand network, Mellanox FDR with half bisection bandwidth.

A sample of ~250,000 files (stripe count 1, average size 100 MB) is read with 
dd, with the output sent to /dev/null.

The location of the files has been recorded, and from this we have drawn up 
separate file lists for each OSS.
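
A minimal sketch of how such per-OSS file lists could be drawn up, assuming the 
OST index of each file is already known (for example as reported by 
`lfs getstripe`) and assuming, purely for illustration, that OST indices are 
assigned contiguously so that OSS n serves OSTs 7n to 7n+6. The real 
OST-to-OSS mapping has to be taken from the site's configuration; the function 
name and the paths are hypothetical.

```python
from collections import defaultdict


def group_by_oss(file_osts, osts_per_oss=7):
    """Group file paths by OSS number.

    file_osts: iterable of (path, ost_index) pairs, one per stripe-count-1 file.
    ASSUMPTION: OST index i is served by OSS number i // osts_per_oss.
    Replace this rule with the site's real OST-to-OSS mapping.
    """
    lists = defaultdict(list)
    for path, ost_index in file_osts:
        lists[ost_index // osts_per_oss].append(path)
    return dict(lists)


if __name__ == "__main__":
    # Hypothetical sample: two files on OSS 0, one file on OSS 1.
    sample = [("/lustre/a", 0), ("/lustre/b", 6), ("/lustre/c", 7)]
    for oss, paths in sorted(group_by_oss(sample).items()):
        print(oss, paths)
```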


In the first run, one client reads the files on one OSS and gets a read 
performance X, e.g. 2 GB/s.

In the second run, this setup is simply multiplied by 10 or 40: Client 1 still 
reads from OSS 1, Client 2 works with the files on OSS2, client 3 with OSS 3, 
...

With only 12 pairs of this kind we see 2 or 3 pairs whose performance drops to 
< 500 MB/s. The other pairs keep the read rate seen before. Once they have 
finished, the remaining 2-3 pairs jump back to the original performance.

When the runs are repeated, the affected OSS are not the same as before.

This should exclude effects of bad hardware: servers, disks, cables, switches.

Since this behaviour is reproducible, the effects of interactions with other 
jobs/users can also be excluded.




By now I am able to reproduce the behavior on a test system, same 
configuration, with just 2 client-OSS pairs, nobody else on there.

56 parallel dd processes on client 1, reading files on server 1: 440 MB/s
56 parallel dd processes on client 2, reading files on server 2: 1.6 GB/s

Then kill all processes on client 2. Client 1 continues, rising to 1.1 GB/s
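
A rough sketch of the read test described above, for anyone who wants to 
reproduce the measurement without dd. It reads every file in a list to the end, 
discards the data, and reports the aggregate rate. This is not the script used 
for the numbers above: it uses threads in one process rather than 56 separate 
dd processes, the function names and the 1 MiB block size are assumptions, and 
repeated runs over the same files may be served from the client cache.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def read_file(path, block_size=1024 * 1024):
    """Read one file to the end, discard the data, return the bytes read."""
    total = 0
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    return total


def measure_read_rate(paths, workers=56):
    """Read all files with `workers` parallel readers.

    Returns (total_bytes, elapsed_seconds, rate_in_MB_per_s).
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        total = sum(pool.map(read_file, paths))
    elapsed = time.perf_counter() - start
    rate = total / elapsed / 1e6 if elapsed > 0 else 0.0
    return total, elapsed, rate
```

Running this at the same time on both clients, each with its own per-OSS file 
list, and then stopping one of them should show whether the other client's rate 
changes.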


These processes are not even visible on the MDS of this system, and from all I 
understand the metadata server should be the only connecting element between 
the two pairs?
How do they know about each other? Who or what tells client-1-server-1 to keep 
it low while client-2 is working on server-2?

Curiouser and curiouser,
Thomas




--
--------------------------------------------------------------------
Thomas Roth
Department: Informationstechnologie
Location: SB3 2.291
Phone: +49-6159-71 1453  Fax: +49-6159-71 2986


GSI Helmholtzzentrum für Schwerionenforschung GmbH
Planckstraße 1, 64291 Darmstadt, Germany, www.gsi.de

Commercial Register / Handelsregister: Amtsgericht Darmstadt, HRB 1528
Managing Directors / Geschäftsführung:
Professor Dr. Paolo Giubellino, Jörg Blaurock
Chairman of the Supervisory Board / Vorsitzender des GSI-Aufsichtsrats:
State Secretary / Staatssekretär Dr. Volkmar Dietz

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

--
Andreas Dilger
Principal Lustre Architect
Whamcloud





