Re: [Lustre-discuss] Lustre-discuss Digest, Vol 66, Issue 40

Rick Friedman Tue, 02 Aug 2011 08:08:55 -0700


*******************
Sent from my mobile
Apologies for typos


-----Original Message-----
From: lustre-discuss-requ...@lists.lustre.org 
[lustre-discuss-requ...@lists.lustre.org]
Received: Saturday, 30 Jul 2011, 2:00pm
To: lustre-discuss@lists.lustre.org [lustre-discuss@lists.lustre.org]
Subject: Lustre-discuss Digest, Vol 66, Issue 40




Today's Topics:

   1. Re: Line rate performance for clients (Andreas Dilger)
   2. Re: Line rate performance for clients (Brock Palen)
   3. Random OST Numbers chosen in a stripe (Roger Spellman)


----------------------------------------------------------------------

Message: 1
Date: Fri, 29 Jul 2011 12:01:40 -0600
From: Andreas Dilger <adil...@whamcloud.com>
Subject: Re: [Lustre-discuss] Line rate performance for clients
To: Brock Palen <bro...@umich.edu>
Cc: lustre-discuss discuss <lustre-discuss@lists.lustre.org>
Message-ID: <fa55b2a9-a027-4982-a3fa-4bffa8b5e...@whamcloud.com>
Content-Type: text/plain; charset=us-ascii

On 2011-07-29, at 11:33 AM, Brock Palen wrote:
> I think this is a networking question.
> 
> We have lustre 1.8 clients with 1gig-e interfaces that according to ethtool 
> are running full duplex.
> 
> If I do the following:
> 
> cp /lustre/largeilfe.h5 /tmp/
> 
> I get 117MB/s
> 
> If I then use globus-url-copy to move that file from /tmp/ -> remote tape 
> archive I get 117MB/s
> 
> If I go directly from  /lustre -> archive  I get 50MB/s,  

Strace your globus-url-copy and see what I/O size it is using.  "cp" was 
modified long ago to use the block size reported by stat(2) for copying, and 
Lustre reports a 2MB I/O size for striped files (1MB for unstriped).  If your 
globus tool is using e.g. 4kB reads then it will be very inefficient for 
Lustre, but much less so when reading from /tmp.
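
As a rough illustration of the stat(2) point, the Python sketch below reads the 
preferred I/O size that a filesystem reports (st_blksize) and copies a file in 
chunks of that size.  The function names and the 4kB fallback are assumptions 
made for this example; it is not what cp or globus-url-copy do internally.

```python
import os


def preferred_io_size(path, fallback=4096):
    """Return the st_blksize reported by stat(2) for path, or fallback.

    The fallback value is an assumption for platforms that do not
    expose st_blksize.
    """
    blksize = getattr(os.stat(path), "st_blksize", 0)
    return blksize if blksize > 0 else fallback


def copy_with_blksize(src, dst):
    """Copy src to dst using reads of the size the source filesystem prefers.

    Returns (chunk_size_used, bytes_copied).
    """
    chunk = preferred_io_size(src)
    copied = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            buf = fin.read(chunk)
            if not buf:
                break
            fout.write(buf)
            copied += len(buf)
    return chunk, copied
```

Calling preferred_io_size() on a file under /lustre and on one under /tmp shows 
what each filesystem asks for; comparing that with the read sizes in the strace 
output tells you whether the tool honours it.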

> this is consistently reproducible.  It doesn't matter if I just copy a large 
> file on lustre to lustre,  or scp, or globus.  If I try to ingest and outgest 
> data I get what looks like half duplex performance. 
> 
> Anyone have ideas why I cannot do 1Gig-e full duplex?

I don't think this has anything to do with "full duplex".  117MB/s is pretty 
much  the maximum line rate for GigE (and pretty good for Lustre, if I do say 
so myself) in one direction.  There is presumably no data moving in the other 
direction at that time.
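
The 117MB/s figure can be checked with a back-of-the-envelope calculation.  The 
sketch below assumes a standard 1500-byte MTU, TCP over IPv4 with 12 bytes of 
TCP options (timestamps), and MB meaning 10^6 bytes; those are assumptions, not 
details from this thread, and Lustre's own protocol headers are ignored.

```python
# Per-frame overhead on the wire for standard Ethernet, in bytes.
PREAMBLE_SFD = 8      # preamble plus start-of-frame delimiter
ETH_HEADER = 14       # destination MAC, source MAC, EtherType
FCS = 4               # frame check sequence
INTERFRAME_GAP = 12   # mandatory idle time between frames
IP_HEADER = 20        # IPv4 header without options
TCP_HEADER = 20       # TCP header without options


def tcp_goodput_mb_per_s(link_bits_per_s=1_000_000_000, mtu=1500, tcp_options=12):
    """Estimate one-direction TCP payload throughput in MB/s (10^6 bytes/s)."""
    payload = mtu - IP_HEADER - TCP_HEADER - tcp_options
    wire_bytes = mtu + ETH_HEADER + FCS + PREAMBLE_SFD + INTERFRAME_GAP
    return link_bits_per_s / 8 * payload / wire_bytes / 1e6


print(round(tcp_goodput_mb_per_s(), 1))               # 117.7
print(round(tcp_goodput_mb_per_s(tcp_options=0), 1))  # 118.7
```

Under these assumptions each 1538-byte slot on the wire carries 1448 bytes of 
payload, so roughly 117.7MB/s is the ceiling in one direction.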

Cheers, Andreas
--
Andreas Dilger 
Principal Engineer
Whamcloud, Inc.





------------------------------

Message: 2
Date: Fri, 29 Jul 2011 14:15:42 -0400
From: Brock Palen <bro...@umich.edu>
Subject: Re: [Lustre-discuss] Line rate performance for clients
To: Andreas Dilger <adil...@whamcloud.com>
Cc: lustre-discuss discuss <lustre-discuss@lists.lustre.org>
Message-ID: <78bd437e-8f53-47df-9d87-a98849b4a...@umich.edu>
Content-Type: text/plain; charset=us-ascii



Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
bro...@umich.edu
(734)936-1985



On Jul 29, 2011, at 2:01 PM, Andreas Dilger wrote:

> On 2011-07-29, at 11:33 AM, Brock Palen wrote:
>> I think this is a networking question.
>> 
>> We have lustre 1.8 clients with 1gig-e interfaces that according to ethtool 
>> are running full duplex.
>> 
>> If I do the following:
>> 
>> cp /lustre/largeilfe.h5 /tmp/
>> 
>> I get 117MB/s
>> 
>> If I then use globus-url-copy to move that file from /tmp/ -> remote tape 
>> archive I get 117MB/s
>> 
>> If I go directly from  /lustre -> archive  I get 50MB/s,  
> 
> Strace your globus-url-copy and see what I/O size it is using.  "cp" was 
> modified long ago to use the block size reported by stat(2) for copying, and 
> Lustre reports a 2MB I/O size for striped files (1MB for unstriped).  If your 
> globus tool is using e.g. 4kB reads then it will be very inefficient for 
> Lustre, but much less so when reading from /tmp.
> 
>> this is consistently reproducible.  It doesn't matter if I just copy a large 
>> file on lustre to lustre,  or scp, or globus.  If I try to ingest and 
>> outgest data I get what looks like half duplex performance. 
>> 
>> Anyone have ideas why I cannot do 1Gig-e full duplex?
> 
> I don't think this has anything to do with "full duplex".  117MB/s is pretty 
> much  the maximum line rate for GigE (and pretty good for Lustre, if I do say 
> so myself) in one direction.  There is presumably no data moving in the other 
> direction at that time.

Ah, I guess I wasn't clear.  I only get 117MB/s when I go 'one direction on the 
network', e.g. copy from lustre to /tmp (local drive), or from /tmp out using 
globus.

It's just when the client is reading from lustre and sending the data out at 
the same time that I only get 50MB/s.  

Does that make sense?  Is it even right for me to expect that I could combine 
the performance together and expect full speed in and full speed out if I can 
consistently get them independent of each other? 

> 
> Cheers, Andreas
> --
> Andreas Dilger 
> Principal Engineer
> Whamcloud, Inc.
> 
> 
> 
> 
> 



------------------------------

Message: 3
Date: Fri, 29 Jul 2011 16:49:28 -0400
From: "Roger Spellman" <roger.spell...@terascala.com>
Subject: [Lustre-discuss] Random OST Numbers chosen in a stripe
To: <lustre-discuss@lists.lustre.org>,  <wc-disc...@whamcloud.com>
Message-ID:
        <2c7de72b9bd00f44baeca5b0cbb8739501359...@hermes.terascala.com>
Content-Type: text/plain;       charset="iso-8859-1"

Suppose that I stripe a directory with the following command:

lfs setstripe  -c 4 .

On some of my systems, when I create a file in the directory, the list of OSTs 
for a particular file is sequential, e.g.

   obdidx           objid          objid            group
    12               2            0x2                0
    13               2            0x2                0
    14               2            0x2                0
    15               2            0x2                0

On another one of my systems, when I create files in a similarly striped 
directory, I get seemingly random assignment, e.g.

For one file:

   obdidx           objid          objid            group
    14             6884         0x1ae4                0
    46             6880         0x1ae0                0
     8             6883         0x1ae3                0
    29             6880         0x1ae0                0

For a different file:

   obdidx           objid          objid            group
    13             6884         0x1ae4                0
    28             6880         0x1ae0                0
    44             6880         0x1ae0                0
    27             6880         0x1ae0                0

Why is this?  

How can I control it to always be sequential?
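
To compare many files across systems, the object tables shown above can be 
checked mechanically.  The Python sketch below pulls the obdidx column out of 
text in the layout shown in this message and reports whether the indices are 
consecutive.  The function names are hypothetical, the parser assumes exactly 
this four-column layout, and it does not treat a wrap-around back to OST 0 as 
sequential.

```python
def obd_indices(getstripe_text):
    """Extract the obdidx column from an object table like the ones above."""
    indices = []
    in_table = False
    for line in getstripe_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "obdidx":
            in_table = True          # header row found; data rows follow
            continue
        if in_table:
            if len(fields) >= 4 and fields[0].isdigit():
                indices.append(int(fields[0]))
            elif not fields:
                in_table = False     # a blank line ends the table
    return indices


def is_sequential(indices):
    """True if every index is exactly one more than the previous one."""
    return all(b - a == 1 for a, b in zip(indices, indices[1:]))


sample = """
   obdidx           objid          objid            group
    14             6884         0x1ae4                0
    46             6880         0x1ae0                0
     8             6883         0x1ae3                0
    29             6880         0x1ae0                0
"""
print(obd_indices(sample), is_sequential(obd_indices(sample)))
```

For the sample table this prints the list [14, 46, 8, 29] followed by False; the 
first table in this message (12, 13, 14, 15) would give True.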

Thanks.

Roger Spellman
Staff Engineer
Terascala, Inc.
508-588-1501
www.terascala.com <http://www.terascala.com/>


------------------------------

_______________________________________________
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


End of Lustre-discuss Digest, Vol 66, Issue 40
**********************************************
