Re: [lustre-discuss] Why reads are slower than writes in lustre

2021-09-28 Thread Colin Faber via lustre-discuss
The amount of data you're testing is far too small. Try upping it 10x so you have a longer run time and you achieve a more reasonable average. On Tue, Sep 28, 2021 at 6:02 PM Nagmat Nazarov wrote: > Dear Engineers, > > I have started working on a lustre file system. I have done a couple of >

[lustre-discuss] Why reads are slower than writes in lustre

2021-09-28 Thread Nagmat Nazarov
Dear Engineers, I have started working on a Lustre file system. I have done a couple of experiments so far. In the first experiment I am writing 100 files of 10MB each with a batch size of 4K. I got *704MB/s* bandwidth. In the second experiment I am (writing 10 files each 10MB and reading 1 10 MB

Re: [lustre-discuss] how to enforce traffic to OSS on o2ib1 only ?

2021-09-28 Thread Riccardo Veraldi
You are more than right. The IB interface 172.21.164.116 is not registered, only the TCP one is:
- { index: 234, event: add_uuid, nid: 172.21.156.102@tcp1(0x20001ac159c66), node: 172.21.156.102@tcp1 }
- { index: 240, event: add_uuid, nid: 172.21.156.102@tcp1(0x20001ac159c66), node:

Re: [lustre-discuss] how to enforce traffic to OSS on o2ib1 only ?

2021-09-28 Thread Stephane Thiell via lustre-discuss
Hi Riccardo, I would check if the OSTs on this OSS have been registered with the correct NIDs (o2ib1) on the MGS:
$ lctl --device MGS llog_print <fsname>-client
and look for the NIDs in setup/add_conn for the OSTs in question. Best, Stephane > On Sep 28, 2021, at 9:52 AM, Riccardo Veraldi >

[lustre-discuss] how to enforce traffic to OSS on o2ib1 only ?

2021-09-28 Thread Riccardo Veraldi
Hello. I have a Lustre setup where the MDS (172.21.156.112) is on tcp1 while the OSSes are on o2ib1. I am using Lustre 2.12.7 on RHEL 7.9. All the clients can see the MDS correctly as a tcp1 peer:
peer:
    - primary nid: 172.21.156.112@tcp1
      Multi-Rail: True
      peer ni:
        -

Re: [lustre-discuss] Why reads are slower than writes on lustre file system?

2021-09-28 Thread Ben Evans via lustre-discuss
On the second experiment, you’re writing a total of 1000MB and reading 100MB. It could simply be that you’re not putting enough load on the system for long enough to get full performance. -Ben Evans From: lustre-discuss on behalf of Colin Faber via lustre-discuss Reply-To: Colin Faber

[lustre-discuss] LAD'21 starts today!

2021-09-28 Thread thomas.leibov...@cea.fr
Dear Lustre community, LAD'21 starts today! If you haven't already done so, you can still register for the webinar, which takes place over the next 3 days. The presentations are broadcast twice a day to cover all time zones: - One broadcast at 8AM UTC (10AM CEST Paris, Berlin,