Re: [lustre-discuss] Migrating files doesn't free space on the OST

2019-01-17 Thread Jason Williams
Chad hit the nail on the head.  I thought about the fact that it was still 
deactivated yesterday but was afraid to reactivate it until I verified the 
space was free.


FWIW, the page about handling full OSTs does not mention that the space
will not be freed until you reactivate the OST.  It actually implies the
opposite.


http://wiki.lustre.org/Handling_Full_OSTs




--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu




From: Chad DeWitt 
Sent: Thursday, January 17, 2019 3:07 PM
To: Jason Williams
Cc: Alexander I Kulyavtsev; lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] Migrating files doesn't free space on the OST

Hi Jason,

I do not know if this will help you or not, but I had a situation in 2.8.0 
where an OST filled up and I marked it as disabled on the MDS:

lctl dl | grep osc
...Grab the device_id of the full OST and then deactivate it...
lctl --device device_id deactivate

IIRC, this allowed the data to be read, but deletes were not processed.  When I
re-activated the OST, the deletes were processed and space started clearing.
I think you stated you had the OST deactivated.  If you still do, try to
reactivate it.

lctl --device device_id activate

Once you reactivate the OST, the deletes will start processing within 10 - 30 
seconds...  Just use lfs df -h to watch...
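
For example, something like this (assuming a client mount at /lustre; adjust
the path for your site) will poll the OST usage every 10 seconds:

watch -n 10 lfs df -h /lustre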

-cd




Chad DeWitt, CISSP

UNC Charlotte | ITS – University Research Computing

9201 University City Blvd. | Charlotte, NC 28223

ccdew...@uncc.edu | www.uncc.edu




If you are not the intended recipient of this transmission or a person 
responsible for delivering it to the intended recipient, any disclosure, 
copying, distribution, or other use of any of the information in this 
transmission is strictly prohibited. If you have received this transmission in 
error, please notify me immediately by reply email or by telephone at 
704-687-7802. Thank you.


On Thu, Jan 17, 2019 at 2:38 PM Jason Williams <jas...@jhu.edu> wrote:

Hello Alexander,


Thank you for your reply.

- We are not using zfs, it's an LDISKFS backing store, so no snapshots.

- I have re-run lfs getstripe to make sure the file is indeed moving

- I just looked for lfsck but I don't seem to have it.  We are running 2.10.4 
so I don't know what version that appeared in.

- I will try to have a look into the jobstats and see what I can find, but I 
made sure the files I moved were not in use when I moved them.



--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu




From: Alexander I Kulyavtsev <a...@fnal.gov>
Sent: Thursday, January 17, 2019 12:56 PM
To: Jason Williams; lustre-discuss@lists.lustre.org
Subject: Re: Migrating files doesn't free space on the OST


- You can re-run the command that finds files residing on the OST to see
whether the remaining files are new or old.

- ZFS may be holding snapshots if you ever took any; they consume space.

- Removing data or snapshots releases the blocks with some lag (tens of
minutes), but I guess that has completed by now.

- There can be orphan objects on the OST if you have had crashes. On older
Lustre versions, once the OST is emptied out you can mount the underlying fs as
ext4 or ZFS, set the mount read-only, and browse the OST objects to see whether
any orphans are left. On newer Lustre releases you can probably run lfsck (the
Lustre scanner).

- To find which hosts / jobs are currently writing to Lustre, you can enable
Lustre jobstats, clear the counters, and parse the stats files in /proc (see
the sketch after this list). There was an xltop tool on GitHub for older Lustre
versions that lacked jobstats, but it has not been updated for a while.

- Depending on your Lustre version, the implementation of lfs migrate differs.
The older version copied the file under another name to another OST, renamed
it, and removed the old file; if a file open for write by an application was
migrated, the data would not be released until the file was closed (and the
data in the new file would be wrong). The recent implementation swaps the file
objects while holding the file layout lock. I cannot tell whether it is safe
for files under active write.

- Not releasing space can be a bug - did you check Jira on Whamcloud? What
version of Lustre do you have? Is it ldiskfs or ZFS based? Which ZFS version?
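
A minimal sketch of the jobstats approach (assuming the filesystem is named
testfs and the suspect OST is OST0002; substitute your own names):

lctl conf_param testfs.sys.jobid_var=procname_uid   # on the MGS: tag RPCs with process name + UID
lctl set_param obdfilter.*.job_stats=clear          # on each OSS: reset the per-job counters
sleep 60                                            # let some I/O accumulate
lctl get_param obdfilter.testfs-OST0002.job_stats   # look for jobs with growing write_bytes

For the orphan check on newer releases, something like "lctl lfsck_start -M
testfs-MDT0000 -t layout -o" on the MDS should scan for and re-link orphan OST
objects.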


Alex.



From: lustre-discuss <lustre-discuss-boun...@lists.lustre.org> on behalf of
Jason Williams <jas...@jhu.edu>
Sent: Wednesday, January 16, 2019 10:25 AM
To: lustre-discuss@lists.lustre.org

Re: [lustre-discuss] Migrating files doesn't free space on the OST

2019-01-17 Thread Jason Williams
Hello Alexander,


Thank you for your reply.

- We are not using zfs, it's an LDISKFS backing store, so no snapshots.

- I have re-run lfs getstripe to make sure the file is indeed moving

- I just looked for lfsck but I don't seem to have it.  We are running 2.10.4 
so I don't know what version that appeared in.

- I will try to have a look into the jobstats and see what I can find, but I 
made sure the files I moved were not in use when I moved them.



--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu




From: Alexander I Kulyavtsev 
Sent: Thursday, January 17, 2019 12:56 PM
To: Jason Williams; lustre-discuss@lists.lustre.org
Subject: Re: Migrating files doesn't free space on the OST


- You can re-run the command that finds files residing on the OST to see
whether the remaining files are new or old.

- ZFS may be holding snapshots if you ever took any; they consume space.

- Removing data or snapshots releases the blocks with some lag (tens of
minutes), but I guess that has completed by now.

- There can be orphan objects on the OST if you have had crashes. On older
Lustre versions, once the OST is emptied out you can mount the underlying fs as
ext4 or ZFS, set the mount read-only, and browse the OST objects to see whether
any orphans are left. On newer Lustre releases you can probably run lfsck (the
Lustre scanner).

- To find which hosts / jobs are currently writing to Lustre, you can enable
Lustre jobstats, clear the counters, and parse the stats files in /proc. There
was an xltop tool on GitHub for older Lustre versions that lacked jobstats, but
it has not been updated for a while.

- Depending on your Lustre version, the implementation of lfs migrate differs.
The older version copied the file under another name to another OST, renamed
it, and removed the old file; if a file open for write by an application was
migrated, the data would not be released until the file was closed (and the
data in the new file would be wrong). The recent implementation swaps the file
objects while holding the file layout lock. I cannot tell whether it is safe
for files under active write.

- Not releasing space can be a bug - did you check Jira on Whamcloud? What
version of Lustre do you have? Is it ldiskfs or ZFS based? Which ZFS version?


Alex.



From: lustre-discuss  on behalf of 
Jason Williams 
Sent: Wednesday, January 16, 2019 10:25 AM
To: lustre-discuss@lists.lustre.org
Subject: [lustre-discuss] Migrating files doesn't free space on the OST


I am trying to migrate files I know are not in use off of the full OST that I 
have using lfs migrate.  I have verified up and down that the files I am moving 
are on that OST and that after the migrate lfs getstripe indeed shows they are 
no longer on that OST since it's disabled in the MDS.


The problem is, the used space on the OST is not going down.


I see at least two possible explanations:

- the OST is just not freeing the space for some reason or another (I don't
know why)

- Or someone is writing to existing files just as fast as I am clearing the
data (possible, but kind of hard to track down)


Is there possibly something else I am missing?  Also, does anyone know a good
way to see whether some client is writing to that OST, and to determine who it
is, in case that's what is going on?



--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Migrating files doesn't free space on the OST

2019-01-16 Thread Jason Williams
I am trying to migrate files I know are not in use off of the full OST that I 
have using lfs migrate.  I have verified up and down that the files I am moving 
are on that OST and that after the migrate lfs getstripe indeed shows they are 
no longer on that OST since it's disabled in the MDS.


The problem is, the used space on the OST is not going down.


I see at least two possible explanations:

- the OST is just not freeing the space for some reason or another (I don't
know why)

- Or someone is writing to existing files just as fast as I am clearing the
data (possible, but kind of hard to track down)


Is there possibly something else I am missing?  Also, does anyone know a good
way to see whether some client is writing to that OST, and to determine who it
is, in case that's what is going on?



--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] lfs_migrate issue

2019-01-16 Thread Jason Williams
Hello Ahmed,


I'm rather new to the lfs_migrate command as well, but one thing to double 
check is after you run the migrate, have you done an


lfs getstripe


for a couple of the files it thinks it migrated to make sure they moved off of 
the OST?  Also, did you properly disable the OST in the MDS to make sure new 
files were not written to it?  Here is the document I was following: 
http://wiki.lustre.org/Handling_Full_OSTs
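
Concretely, something like this should confirm the move (the path is
hypothetical; the OST UUID is the one from your lfs find command):

lfs getstripe /lustre/path/to/migrated/file        # the obdidx column should no longer list the old OST
lfs find --obd lustre-OST0001_UUID /lustre | head  # should come back empty once everything has moved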



--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu



From: lustre-discuss  on behalf of 
Ahmed Fahmy 
Sent: Tuesday, January 15, 2019 6:33:24 AM
To: lustre-discuss@lists.lustre.org
Cc: Supercomputer
Subject: [lustre-discuss] lfs_migrate issue


Good day,

I have been trying to migrate the data off of one of the OSTs in my Lustre file
system before removing the OST.

I have used the following command:

lfs find --obd lustre-OST0001_UUID /lustre | lfs_migrate -sy

I believe the data has been copied correctly.


However, when I check the size of the directories that participated in the
migration, I notice that their size has increased by 3 or 4 GB, even after
removing the OST.

I am not sure what the reason for this is, or how I can return those
directories to their original sizes.

I am using lustre 2.10.1.

Any help would be appreciated.

Regards,

Ahmed Fahmy
Bibliotheca Alexandrina

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org



Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?

2019-01-07 Thread Jason Williams
Thank you again Rick.

One last question: how safe is lfs_migrate?  The man page on our installation
says it's UNSAFE for possibly in-use files.  The Lustre manual doesn't have the
same warning and says something about it being a bit more integrated with the
MDS.


http://doc.lustre.org/lustre_manual.xhtml#dbdoclet.lfs_migrate


How safe do you think this would be to run on some files on the OST with it 
disabled on the MDS and active jobs running on the cluster?  I could do this by 
group, possibly to mitigate concerns of open files, if need be.


--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu




From: Mohr Jr, Richard Frank (Rick Mohr) 
Sent: Monday, January 7, 2019 1:07 PM
To: Jason Williams
Cc: lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?



> On Jan 7, 2019, at 12:53 PM, Jason Williams  wrote:
>
> As I have gone through the testing, I think you may be right.  I think I 
> disabled the OST in a slightly different way and that caused issues.
>
> Do you happen to know where I could find out a bit more about what the "lctl
> set_param osp..max_create_count=0" command would do?

The Lustre manual has a section on removing MDTs/OSTs:

http://doc.lustre.org/lustre_manual.xhtml#dbdoclet.deactivating_mdt_ost

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?

2019-01-07 Thread Jason Williams
As I have gone through the testing, I think you may be right.  I think I 
disabled the OST in a slightly different way and that caused issues.


Do you happen to know where I could find out a bit more about what the "lctl
set_param osp..max_create_count=0" command would do?




--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu




From: Mohr Jr, Richard Frank (Rick Mohr) 
Sent: Monday, January 7, 2019 12:35 PM
To: Jason Williams
Cc: lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?



Jason,

The results you described sound like the correct behavior when you deactivate
an OST on the MDS.  When you run "lctl --device  deactivate", you are
essentially telling the MDS to ignore that OST when it assigns stripes to a new
file.  The OST will still be visible to all clients and the MDS, which allows
the clients to keep reading files from that OST and allows you to delete files
from the OST.  The only down side is that any file that already exists on that
OST can still be written to.  Deactivating an OST is intended to stop the flow
of new data to that OST while you work on removing some of the existing data,
but it doesn't actually make the OST read-only.  I think you can get the same
effect from Lustre 2.9 (or newer) by using "lctl set_param
osp..max_create_count=0".
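
For example (a sketch, assuming the filesystem is named testfs and the full OST
is OST0002; run on the MDS):

lctl set_param osp.testfs-OST0002-osc-MDT0000.max_create_count=0      # stop new object creation on this OST
# ... migrate or delete data as needed ...
lctl set_param osp.testfs-OST0002-osc-MDT0000.max_create_count=20000  # restore creates (20000 is the usual default)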

I suspect that what you originally did was to deactivate the OST using
something like "lctl conf_param .osc.active=0". This will notify all
Lustre clients to deactivate the OST, which I believe causes the hangs you were
seeing when any client tries to remove or stat a file on that OST.

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu


> On Jan 7, 2019, at 11:56 AM, Jason Williams  wrote:
>
> Sorry for the spam, but here are a few more interesting results:
>
> If I create a file that stripes only on the full OST, and then disable the 
> OST, I get the following:
>
>• I can overwrite the file within its original size and it takes up
> space on the "disabled" OST.
>• I can zero the file.
>• I can write more data to the file than it originally had, i.e. the
> original file before disabling the OST was 1G, and I can overwrite the file
> with > 1G with the OST disabled.
>• If I create a new file asking for that OST with the OST disabled, I
> get a different OST.
>
>
> #2 and #4 are the only expected behaviors.  I'm not sure what the behavior
> should be in the case of #1 and #3.
>
>
> --
> Jason Williams
> Assistant Director
> Systems and Data Center Operations.
> Maryland Advanced Research Computing Center (MARCC)
> Johns Hopkins University
> jas...@jhu.edu
>
>
> From: lustre-discuss  on behalf of 
> Jason Williams 
> Sent: Monday, January 7, 2019 11:47:09 AM
> To: Mohr Jr, Richard Frank (Rick Mohr)
> Cc: lustre-discuss@lists.lustre.org
> Subject: Re: [lustre-discuss] Full OST, any way of avoiding it without 
> hanging?
>
> So I found this: http://wiki.lustre.org/Handling_Full_OSTs which is what I 
> thought I had followed before but ran into hang issues.  I did some quick 
> testing with this and found that:
>
> 1. If I deactivate the OST in the MDS, no new files appear to be created on
> that OST (expected behavior) and no hangs.
> 2. If I first create a file on the OST with it activated, then deactivate the
> OST, and OVERWRITE a file that was spanned on that OST, the indexes stay the
> same and the file successfully overwrites (the file spanned 4 OSTs, so
> perhaps a little more testing with a single OST in the index is necessary).
> 3. Deactivating the OST shows it as inactive in the MDS but UP in the Client. 
> (not expected.)
> 4. I am able to delete a file that spans that OST with the OST deactivated, 
> no hang.
>
> I think the only thing here that concerns me a bit is #2.
>
> --
> Jason Williams
> Assistant Director
> Systems and Data Center Operations.
> Maryland Advanced Research Computing Center (MARCC)
> Johns Hopkins University
> jas...@jhu.edu
>
>
> From: lustre-discuss  on behalf of 
> Jason Williams 
> Sent: Sunday, January 6, 2019 5:22:16 PM
> To: Mohr Jr, Richard Frank (Rick Mohr)
> Cc: lustre-discuss@lists.lustre.org
> Subject: Re: [lustre-discuss] Full OST, any way of avoiding it without 
> hanging?
>
> Hi Rick,
> I thought what I had done was disable it on the MDS, but perhaps I was 
> following the wrong instructions. Do you know where the best instructions for 
> what you are describing can be found? I would be willing to try again.

Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?

2019-01-07 Thread Jason Williams
Sorry for the spam, but here are a few more interesting results:


If I create a file that stripes only on the full OST, and then disable the OST, 
I get the following:


  1.  I can overwrite the file within its original size and it takes up space
on the "disabled" OST.
  2.  I can zero the file.
  3.  I can write more data to the file than it originally had, i.e. the
original file before disabling the OST was 1G, and I can overwrite the file
with > 1G with the OST disabled.
  4.  If I create a new file asking for that OST with the OST disabled, I get a
different OST.


#2 and #4 are the only expected behaviors.  I'm not sure what the behavior
should be in the case of #1 and #3.



--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu



From: lustre-discuss  on behalf of 
Jason Williams 
Sent: Monday, January 7, 2019 11:47:09 AM
To: Mohr Jr, Richard Frank (Rick Mohr)
Cc: lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?


So I found this: http://wiki.lustre.org/Handling_Full_OSTs which is what I 
thought I had followed before but ran into hang issues.  I did some quick 
testing with this and found that:


1. If I deactivate the OST in the MDS, no new files appear to be created on
that OST (expected behavior) and no hangs.

2. If I first create a file on the OST with it activated, then deactivate the
OST, and OVERWRITE a file that was spanned on that OST, the indexes stay the
same and the file successfully overwrites (the file spanned 4 OSTs, so perhaps
a little more testing with a single OST in the index is necessary).

3. Deactivating the OST shows it as inactive in the MDS but UP in the Client. 
(not expected.)

4. I am able to delete a file that spans that OST with the OST deactivated, no 
hang.

I think the only thing here that concerns me a bit is #2.


--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu



From: lustre-discuss  on behalf of 
Jason Williams 
Sent: Sunday, January 6, 2019 5:22:16 PM
To: Mohr Jr, Richard Frank (Rick Mohr)
Cc: lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?

Hi Rick,
I thought what I had done was disable it on the MDS, but perhaps I was 
following the wrong instructions. Do you know where the best instructions for 
what you are describing can be found? I would be willing to try again.

—
Sent you tersely from my phone
Jason Williams

From: Mohr Jr, Richard Frank (Rick Mohr) 
Sent: Sunday, January 6, 2019 4:56 PM
To: Jason Williams
Cc: lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?



> On Jan 5, 2019, at 9:49 PM, Jason Williams  wrote:
>
> I have looked around the internet and found you can disable an OST, but when 
> I have tried that, any writes (including deletes) to the OST hang the clients 
> indefinitely. Does anyone know a way to make an OST basically "read-only" 
> with the exception of deletes so we can work to clear out the OST?

What command did you use to disable the OST?

There is a way to disable the OST on all the clients, but there is also a way 
to deactivate it on the MDS. The latter method should prevent the MDS from 
allocating any new files to the OST, but still allow clients to read and delete 
files on that OST.

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?

2019-01-07 Thread Jason Williams
So I found this: http://wiki.lustre.org/Handling_Full_OSTs which is what I 
thought I had followed before but ran into hang issues.  I did some quick 
testing with this and found that:


1. If I deactivate the OST in the MDS, no new files appear to be created on
that OST (expected behavior) and no hangs.

2. If I first create a file on the OST with it activated, then deactivate the
OST, and OVERWRITE a file that was spanned on that OST, the indexes stay the
same and the file successfully overwrites (the file spanned 4 OSTs, so perhaps
a little more testing with a single OST in the index is necessary).

3. Deactivating the OST shows it as inactive in the MDS but UP in the Client. 
(not expected.)

4. I am able to delete a file that spans that OST with the OST deactivated, no 
hang.

I think the only thing here that concerns me a bit is #2.


--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu



From: lustre-discuss  on behalf of 
Jason Williams 
Sent: Sunday, January 6, 2019 5:22:16 PM
To: Mohr Jr, Richard Frank (Rick Mohr)
Cc: lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?

Hi Rick,
I thought what I had done was disable it on the MDS, but perhaps I was 
following the wrong instructions. Do you know where the best instructions for 
what you are describing can be found? I would be willing to try again.

—
Sent you tersely from my phone
Jason Williams

From: Mohr Jr, Richard Frank (Rick Mohr) 
Sent: Sunday, January 6, 2019 4:56 PM
To: Jason Williams
Cc: lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?



> On Jan 5, 2019, at 9:49 PM, Jason Williams  wrote:
>
> I have looked around the internet and found you can disable an OST, but when 
> I have tried that, any writes (including deletes) to the OST hang the clients 
> indefinitely. Does anyone know a way to make an OST basically "read-only" 
> with the exception of deletes so we can work to clear out the OST?

What command did you use to disable the OST?

There is a way to disable the OST on all the clients, but there is also a way 
to deactivate it on the MDS. The latter method should prevent the MDS from 
allocating any new files to the OST, but still allow clients to read and delete 
files on that OST.

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?

2019-01-06 Thread Jason Williams
Hi Rick,
I thought what I had done was disable it on the MDS, but perhaps I was 
following the wrong instructions. Do you know where the best instructions for 
what you are describing can be found? I would be willing to try again.

—
Sent you tersely from my phone
Jason Williams

From: Mohr Jr, Richard Frank (Rick Mohr) 
Sent: Sunday, January 6, 2019 4:56 PM
To: Jason Williams
Cc: lustre-discuss@lists.lustre.org
Subject: Re: [lustre-discuss] Full OST, any way of avoiding it without hanging?



> On Jan 5, 2019, at 9:49 PM, Jason Williams  wrote:
>
> I have looked around the internet and found you can disable an OST, but when 
> I have tried that, any writes (including deletes) to the OST hang the clients 
> indefinitely. Does anyone know a way to make an OST basically "read-only" 
> with the exception of deletes so we can work to clear out the OST?

What command did you use to disable the OST?

There is a way to disable the OST on all the clients, but there is also a way 
to deactivate it on the MDS. The latter method should prevent the MDS from 
allocating any new files to the OST, but still allow clients to read and delete 
files on that OST.

--
Rick Mohr
Senior HPC System Administrator
National Institute for Computational Sciences
http://www.nics.tennessee.edu


___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Full OST, any way of avoiding it without hanging?

2019-01-05 Thread Jason Williams
Hello,


We have a lustre system (version 2.10.4) that has unfortunately fallen victim 
to a 100% full OST... Every time we clear some space on it, the system fills it 
right back up again.


I have looked around the internet and found you can disable an OST, but when I 
have tried that, any writes (including deletes) to the OST hang the clients 
indefinitely.  Does anyone know a way to make an OST basically "read-only" with 
the exception of deletes so we can work to clear out the OST?  Or better yet, a 
way to "drain" or move files off an OST with a script (keeping in mind it might 
not be known if the files are in use at the time).  Or even a way to tell 
lustre "Hey don't write any new data here, but reading and removing data is OK."




--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Quota Reporting (for all users and/or gruops)

2018-12-20 Thread Jason Williams
It is entirely possible that this already exists, but my google-fu is not what
it used to be.  I've searched around the internet, and it seems as though it
doesn't really exist.  There are handfuls of now-defunct or unmaintained
projects out there, but nothing that seems to report all of the user and/or
group quotas.


Does anyone know of a good quota reporting tool that can give quota information 
in the same way as a 'repquota -u' or 'repquota -g' would?
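
For clarity, what I'm after is essentially the output of a loop like the one
below (a sketch, assuming a client mount at /lustre and users enumerated via
getent), just from a real tool rather than a script:

#!/bin/bash
# Print quota usage for every user, one line per user (repquota-style).
for u in $(getent passwd | cut -d: -f1); do
    printf '%-16s ' "$u"
    lfs quota -q -u "$u" /lustre
done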


--

Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


[lustre-discuss] Spiking OSS load?

2017-08-01 Thread Jason Williams
Hello,

First off, the Lustre we run here was installed by Intel, so figuring out the
exact version seems to require a lookup in a table internal to Intel, but I'm
told it's probably 2.5-ish...

Recently, I finally installed a monitoring system on my OSS/MDS servers.  Over
the last week or so, the OSS servers have been spiking to a 100+ load average
(sometimes much higher).  They are not going unresponsive, from what I can
tell, and the processes causing the load seem to be the ll_ost_io07_XXX
processes (where XXX is a number), since they are going into "D" state
(uninterruptible I/O wait).

I recently attended "The 3rd International Workshop on the Lustre Ecosystem"
(GREAT WORKSHOP!!), and a couple of the talks got me thinking about tunables.

One tunable in particular was ost.threads_max.  The guidance on lustre.org
works out to (RAM / 128MB * num_cpus), which on my system comes to well over
the maximum allowable thread count of 512.  So my OSS machines are all set to a
threads_max of 512, and indeed on all of the machines threads_started is 512.
(It's a VERY busy file system.)
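
For reference, this is how I've been checking and adjusting them (parameter
names as in recent 2.x releases; our Intel-built tree may differ slightly):

lctl get_param ost.OSS.ost_io.threads_max ost.OSS.ost_io.threads_started
lctl set_param ost.OSS.ost_io.threads_max=384   # lower the cap; threads already started don't exit until the service restarts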

This leads me to the following questions (and possibly more, but let's start 
with these):


1)  Is 512 threads a reasonable setting or should it be lower?

2)  Is high load "normal" if the file system is under heavy use?  At the 
time I see a lot of open and attr calls which I thought would load the MDS over 
the OSS... but my under-the-hood understanding is limited at best.

3)  Should I be looking at other tunables?

I realize the information provided in this initial email is limited as well, so 
if you are curious about anything else, please let me know what else might be 
interesting.

Oh and as for the MDS/OSS setup, here's a brief overview too:

2x MDS in failover mode with one MDT
12x OSS in failover pairs with 12 OSTs per pair, 6 running on each OSS (72
OSTs total, 6 active on each of the 12 OSSes for load sharing)
Each OSS pair is hooked to the same set of 2x RAID arrays (Dell MD3460)


--
Jason Williams
Assistant Director
Systems and Data Center Operations.
Maryland Advanced Research Computing Center (MARCC)
Johns Hopkins University
jas...@jhu.edu

___
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org


Re: [Lustre-discuss] Kernel Oops on the stock RHEL 4 kernel?

2008-10-10 Thread Jason Williams
Guy Coates wrote:
> Jason Williams wrote:
>> Hi
>> I have been playing around with lustre 1.6.5.1 as part of some testing
>> that we are doing for an up and coming cluster.  I installed it on 2
>> test machines, Dell 2950's with 8 GB of ram to be exact, and fired up a
>> test file system.
>> The test file system was very simple:
>>
>> /dev/sdb - ~400GB for the MDT/MGS
>> /dev/sdc - ~4TB for the OST
>
> Quick questions; did you mount the filesystem on the OST/MDS machine, or on a
> separate client? Mounting filesystems on OST/MDS nodes is not supported.
>
> Does it work with an OST disk < 2TB? Support for disks > 2TB is patchy
> depending on your exact disk controller hardware.
>
> Cheers,
>
> Guy

Hi Guy

Hmm, yes, the file system is mounted on the OST/MDS node.  Lustre 1.4
seemed to not really have any issue with that.  I wonder why it's not
supported. And thanks for the heads up on the > 2TB support.  Looks like
I get to go back to my boss and have a discussion about alternatives.

--
Jason

___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss


Re: [Lustre-discuss] Kernel Oops on the stock RHEL 4 kernel?

2008-10-10 Thread Jason Williams
Brian J. Murrell wrote:
> On Fri, 2008-10-10 at 08:32 -0400, Jason Williams wrote:
>> Hi
>
> Hi.
>
>> --- [cut here ] - [please bite here ] -
>> Kernel BUG at mballoc:1334
>
> This looks like bug 16101 fixed in 1.6.6.  There is a patch in that bug
> you can apply if you wish or you can wait for 1.6.6.  Before you ask
> though, I don't know when 1.6.6 will be released.
>
> b.


Brian,

What about Guy's comments about running the OST on a machine that also 
has the filesystem mounted via lustre client?  Is that still technically 
unsupported in 1.6.6?

--
Jason
___
Lustre-discuss mailing list
Lustre-discuss@lists.lustre.org
http://lists.lustre.org/mailman/listinfo/lustre-discuss