Re: [lustre-discuss] Lustre OSS and clients on same physical server

2016-07-15 Thread Faaland, Olaf P.
Cory,

For what it's worth, the existing tests and framework run in the
single-node configuration without any special steps (or at least did
within the last year or so).  You just build lustre, run llmount to get
servers up and client mounted, and then run tests/sanity.sh.
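(For reference, a minimal sketch of that flow, assuming a built source
tree and the helper scripts shipped in lustre/tests:

   cd lustre/tests
   sh llmount.sh          # format loopback targets, start MGS/MDT/OST, mount a client
   sh sanity.sh           # run the sanity suite against the local mount
   sh llmountcleanup.sh   # tear everything down afterwards
)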

You then get varying results each time you do this.  Some tests are
themselves flawed (i.e., racy); others are fine in themselves but fail
intermittently because of some more general problem, like memory
management issues.  The issues that arise typically aren't easy to
diagnose, in my experience.

The problem is resources: spending them to investigate this behavior
instead of on better testing in the typical multi-node configuration,
implementing new features, doing code cleanup, etc.  In other words,
sadly, the usual problem.

-Olaf

On 7/15/16, 1:38 PM, "lustre-discuss on behalf of Cory Spitz" wrote:

>Good input, Chris.  Thanks.
>
>It sounds like we might need to move this over to lustre-devel.
>
>Someday, I’d like to see us address some of these things and then add
>some test framework tests that co-locate clients with servers.  Not
>necessarily because we expect co-located services, but because it could
>be a useful driver of keeping Lustre a good memory manager.
>
>-Cory
>
>-- 
>
>
>On 7/15/16, 3:17 PM, "Christopher J. Morrone"  wrote:
>
>On 07/15/2016 12:11 PM, Cory Spitz wrote:
>> Chris,
>> 
>> On 7/13/16, 2:00 PM, "lustre-discuss on behalf of Christopher J.
>>Morrone" <morro...@llnl.gov> wrote:
>> 
>>> If you put both the client and server code on the same node and do any
>>> serious amount of IO, it has been pretty easy in the past to get that
>>> node to go completely out to lunch thrashing on memory issues
>> 
>> Chris, you wrote “in the past.”  How current is your experience?  I’m
>>sure it is still a good word of caution, but I’d venture that modern
>>Lustre (on a modern kernel) might fare a tad bit better.  Does anyone
>>have experience on current releases?
>
>Pretty recent.
>
>We have had memory management issues with servers and clients
>independently at pretty much all periods of time, recent history
>included.  Putting the components together only exacerbates the issues.
>
>Lustre still has too many of its own caches with fixed, or nearly
>fixed, cache sizes, and places where it does not play well with the
>kernel memory reclaim mechanisms.  There are too many places where
>Lustre ignores the kernel's requests for memory reclaim, and often goes
>on to use even more memory.  That significantly impedes the kernel's
>ability to keep things responsive when memory contention arises.
>
>> I understand that it isn’t a design goal for us, but perhaps we should
>>pay some attention to this possibility?  Perhaps we’ll have interest in
>>co-locating clients on servers in the near future as part of a
>>replication, network striping, or archiving capability?
>
>There is going to need to be a lot of work to have Lustre's memory usage
>be more dynamic, more aware of changing conditions on the system, and
>more responsive to the kernel's requests to free memory.  I imagine it
>won't be terribly easy, especially in areas such as dirty and unstable
>data which cannot be freed until it is safe on disk.  But even for that,
>there are no doubt ways to make things better.
>
>Chris
>
>
>


Re: [lustre-discuss] Lustre OSS and clients on same physical server

2016-07-15 Thread Cory Spitz
Good input, Chris.  Thanks.

It sounds like we might need to move this over to lustre-devel.

Someday, I’d like to see us address some of these things and then add some test 
framework tests that co-locate clients with servers.  Not necessarily because 
we expect co-located services, but because it could be a useful driver of 
keeping Lustre a good memory manager.

-Cory

-- 


On 7/15/16, 3:17 PM, "Christopher J. Morrone"  wrote:

On 07/15/2016 12:11 PM, Cory Spitz wrote:
> Chris,
> 
> On 7/13/16, 2:00 PM, "lustre-discuss on behalf of Christopher J. Morrone"
> wrote:
> 
>> If you put both the client and server code on the same node and do any
>> serious amount of IO, it has been pretty easy in the past to get that
>> node to go completely out to lunch thrashing on memory issues
> 
> Chris, you wrote “in the past.”  How current is your experience?  I’m sure it 
> is still a good word of caution, but I’d venture that modern Lustre (on a 
> modern kernel) might fare a tad bit better.  Does anyone have experience on 
> current releases?

Pretty recent.

We have had memory management issues with servers and clients
independently at pretty much all periods of time, recent history
included.  Putting the components together only exacerbates the issues.

Lustre still has too many of its own caches with fixed, or nearly fixed,
cache sizes, and places where it does not play well with the kernel
memory reclaim mechanisms.  There are too many places where Lustre
ignores the kernel's requests for memory reclaim, and often goes on to
use even more memory.  That significantly impedes the kernel's ability
to keep things responsive when memory contention arises.

> I understand that it isn’t a design goal for us, but perhaps we should pay 
> some attention to this possibility?  Perhaps we’ll have interest in 
> co-locating clients on servers in the near future as part of a replication, 
> network striping, or archiving capability?

There is going to need to be a lot of work to have Lustre's memory usage
be more dynamic, more aware of changing conditions on the system, and
more responsive to the kernel's requests to free memory.  I imagine it
won't be terribly easy, especially in areas such as dirty and unstable
data which cannot be freed until it is safe on disk.  But even for that,
there are no doubt ways to make things better.

Chris





Re: [lustre-discuss] Lustre OSS and clients on same physical server

2016-07-15 Thread Christopher J. Morrone
On 07/15/2016 12:11 PM, Cory Spitz wrote:
> Chris,
> 
> On 7/13/16, 2:00 PM, "lustre-discuss on behalf of Christopher J. Morrone"
> wrote:
> 
>> If you put both the client and server code on the same node and do any
>> serious amount of IO, it has been pretty easy in the past to get that
>> node to go completely out to lunch thrashing on memory issues
> 
> Chris, you wrote “in the past.”  How current is your experience?  I’m sure it 
> is still a good word of caution, but I’d venture that modern Lustre (on a 
> modern kernel) might fare a tad bit better.  Does anyone have experience on 
> current releases?

Pretty recent.

We have had memory management issues with servers and clients
independently at pretty much all periods of time, recent history
included.  Putting the components together only exacerbates the issues.

Lustre still has too many of its own caches with fixed, or nearly fixed,
cache sizes, and places where it does not play well with the kernel
memory reclaim mechanisms.  There are too many places where Lustre
ignores the kernel's requests for memory reclaim, and often goes on to
use even more memory.  That significantly impedes the kernel's ability
to keep things responsive when memory contention arises.
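(As a hedged illustration of the fixed-cache point, a couple of the
knobs involved can be read and capped by hand with lctl; the parameter
names here are from 2.x-era clients and may differ by version:

   lctl get_param llite.*.max_cached_mb        # client data-cache ceiling
   lctl set_param llite.*.max_cached_mb=4096   # cap it on a co-located node
   lctl set_param ldlm.namespaces.*.lru_size=1200   # bound the LDLM lock LRU

Having to tune these by hand is the flip side of the problem described
above: the kernel often cannot shrink them effectively on its own when
memory pressure arises.)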

> I understand that it isn’t a design goal for us, but perhaps we should pay 
> some attention to this possibility?  Perhaps we’ll have interest in 
> co-locating clients on servers in the near future as part of a replication, 
> network striping, or archiving capability?

There is going to need to be a lot of work to have Lustre's memory usage
be more dynamic, more aware of changing conditions on the system, and
more responsive to the kernel's requests to free memory.  I imagine it
won't be terribly easy, especially in areas such as dirty and unstable
data which cannot be freed until it is safe on disk.  But even for that,
there are no doubt ways to make things better.

Chris



Re: [lustre-discuss] Lustre OSS and clients on same physical server

2016-07-15 Thread Cory Spitz
Chris,

On 7/13/16, 2:00 PM, "lustre-discuss on behalf of Christopher J. Morrone"
wrote:

> If you put both the client and server code on the same node and do any
> serious amount of IO, it has been pretty easy in the past to get that
> node to go completely out to lunch thrashing on memory issues

Chris, you wrote “in the past.”  How current is your experience?  I’m sure it 
is still a good word of caution, but I’d venture that modern Lustre (on a 
modern kernel) might fare a tad bit better.  Does anyone have experience on 
current releases?

I understand that it isn’t a design goal for us, but perhaps we should pay some 
attention to this possibility?  Perhaps we’ll have interest in co-locating 
clients on servers in the near future as part of a replication, network 
striping, or archiving capability?

-Cory





Re: [lustre-discuss] Error on a zpool underlying an OST

2016-07-15 Thread Kevin Abbey

Hi Bob,

Thank you for the notes.  I began examining the zpool before obtaining
the new LSI card, but was unable to start Lustre without it.  Once I
installed the replacement and re-examined the zpools, the resilvered
pool was re-scrubbed, exported and re-imported, and to my surprise
repaired.  As a further test, I removed the spare disk that had
replaced the "apparent" bad disk and re-added the disk that had been
removed.  The zpool resilvered OK and scrubbed clean.  Lustre mounted
and cleaned a few orphaned blocks, and appeared fully functional from
the client side.  However, without a "snapshot" (file list, md5sums,
though zfs does internal checksums) of the prior status, I cannot be
sure whether a data file was lost.  This is something I'll need to
address.  Maybe Robinhood can help with this?
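(As a hedged sketch of what such a baseline could look like, assuming a
client mount at /mnt/lustre and a hypothetical manifest path:

   find /mnt/lustre -type f -print0 | xargs -0 -r md5sum > /root/lustre-manifest.md5   # record checksums
   md5sum -c --quiet /root/lustre-manifest.md5   # after a repair, print only mismatches

Robinhood maintains a comparable file inventory in a database as it
scans, which would make the "was anything lost" question answerable.)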


Thanks again for the notes.  They will likely be useful in a similar 
scenario.


Kevin


On 07/12/2016 09:10 AM, Bob Ball wrote:
The answer came offline, and I guess I never replied back to the 
original posting.  This is what I learned.  It deals with only a 
single file, not 1000's.  --bob

---

On Mon, 14 Mar 2016, Bob Ball wrote:

OK, it would seem the affected user has already deleted this file, as 
the "lfs fid2path" returns:

[root@umt3int01 ~]# lfs fid2path /lustre/umt3 [0x22582:0xb5c0:0x0]
fid2path: error on FID [0x22582:0xb5c0:0x0]: No such file or directory


I verified I could do it back and forth using a different file.
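(That round trip, sketched with a hypothetical file and FID:

   lfs path2fid /lustre/umt3/some/file              # prints e.g. [0x22582:0xb5c1:0x0]
   lfs fid2path /lustre/umt3 [0x22582:0xb5c1:0x0]   # prints the path back
)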

I am making one last check, with the OST re-activated (I had set it 
inactive on our MDT/MGS to keep new files off while figuring this out).


Nope, gone.  Time to do the clear and remove the snapshot.

Thanks for your help on this.

bob

On 3/14/2016 10:45 AM, Don Holmgren wrote:

 No, no downside.  The snapshot really is just used so that I can do this
 sort of repair live.

 Once you've found the Lustre OID with "find", for ll_decode_filter_fid
 to work you'll have to then umount the OST and remount as type lustre.

 Good luck!

 Don

Thank you!  This is very helpful.

I have no space to make a snapshot, so I will just umount this OST for
a bit and remount it as zfs.  Our users can take some off-time if we
are not busy just then.
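(Sketched with the pool/dataset name from this thread and hypothetical
mountpoints:

   umount /mnt/lustre/ost0030                   # take the OST out of service
   mount -t zfs ost-007/ost0030 /mnt/ost-zfs    # expose the object tree for "find"
   find /mnt/ost-zfs/O -inum 182543
   umount /mnt/ost-zfs
   mount -t lustre ost-007/ost0030 /mnt/lustre/ost0030   # return it to service
)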


It will be an interesting process.  I'm all set to drain and remake,
though, should this method not work.  I was putting off starting that
until later today, as I have other issues just now.  Since it would
take me 2-3 days total to drain, remake, and refill, your detailed
method is far preferable for me.


Just to be certain, other than the temporary unavailability of the 
Lustre file system, do you see any downside to not working from a 
snapshot?


bob


On 3/14/2016 10:21 AM, Don Holmgren wrote:

 Hi Bob -

 I only get the lustre-discuss digest, so am not sure how to reply to
 that whole list.  But I can reply directly to you regarding your
 posting (copied at the bottom).

 In the ZFS error message

errors: Permanent errors have been detected in the following files:
  ost-007/ost0030:<0x2c90f>

 0x2c90f is the ZFS inode number of the damaged item.  To turn this
 into a Lustre filename, do the following:

 1. First, you have to use "find" with that inode number to get the
    corresponding Lustre object ID.  I do this via a ZFS snapshot,
    something like:

zfs snapshot ost-007/ost0030@mar14
mount -t zfs ost-007/ost0030@mar14 /mnt/snapshot
find /mnt/snapshot/O -inum 182543

 (note 0x2c90f = 182543 decimal).  This may return something like

/mnt/snapshot/O/0/d22/54

 if indeed the damaged item is a file object.
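 (The hex-to-decimal conversion can be done in the shell:

   printf '%d\n' 0x2c90f   # -> 182543
 )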


 2. OK, assuming the "find" did return a file object like above (in
    this case the Lustre OID of the object is 54), you need to find the
    parent "FID" of that OID.  Do this as follows on the OSS where
    you've mounted the snapshot:


[root@lustrenew3 ~]# ll_decode_filter_fid /mnt/snapshot/O/0/d22/54
/mnt/snapshot/O/0/d22/54: parent=[0x204010a:0x0:0x0] stripe=0


 3. That string "0x204010a:0x0:0x0" is related to the Lustre FID.  You
    can use "lfs fid2path" to convert this to a filename.  "lfs
    fid2path" must be executed on a client of your Lustre filesystem.
    And, on our Lustre, the return string must be slightly altered
    (chopped up differently):

 [root@client ~]# lfs fid2path /djhzlus [0x20400:0x10a:0x0]
 /djhzlus/test/copy1/l6496f21b7075m00155m031/gauge/Coulomb/l6496f21b7075m00155m031-Coul_002 



Here /djhzlus was where the Lustre filesystem was mounted on my client
(client).  fid2path takes three numbers; in my case the first was the
first 9 hex digits of the return from ll_decode_filter_fid, the second
was the last 5 hex digits (I suppressed the leading zeros), and the
third was 0x0 (not sure whether this was the 2nd or 3rd field from
ll_decode_filter_fid).

You can always use "lfs path2fid" on your Lustre client against another
file in your filesystem to find the