Regarding clients and OSS on the same physical server: it seems to me the problem is not (directly) related to the amount of memory on the machine, but rather to different applications "competing" for the memory?

Could this possibly be resolved by running Lustre in a virtual machine? Or would there be some other way to "partition" the memory into separate "batches" (or containers), one for the application and one for the servers?

In most cases it seems wise to keep the servers separate from the clients, but e.g., in a "desk side", personal, smaller cluster (with basically only one user) it would be nice (better use of the resources) IF it would be possible to put servers and clients on the same machines.

/jon

On 07/15/2016 10:17 PM, Christopher J. Morrone wrote:
On 07/15/2016 12:11 PM, Cory Spitz wrote:
Chris,

On 7/13/16, 2:00 PM, "lustre-discuss on behalf of Christopher J. Morrone" 
<lustre-discuss-boun...@lists.lustre.org on behalf of morro...@llnl.gov> wrote:

If you put both the client and server code on the same node and do any
serious amount of IO, it has been pretty easy in the past to get that
node to go completely out to lunch thrashing on memory issues.
Chris, you wrote “in the past.”  How current is your experience?  I’m sure it 
is still a good word of caution, but I’d venture that modern Lustre (on a 
modern kernel) might fare a tad bit better.  Does anyone have experience on 
current releases?
Pretty recent.

We have had memory management issues with servers and clients
independently at pretty much all periods of time, recent history
included.  Putting the components together only exacerbates the issues.

Lustre still has too many of its own caches with fixed, or nearly fixed,
sizes, and places where it does not play well with the kernel
memory reclaim mechanisms.  There are too many places where Lustre
ignores the kernel's requests for memory reclaim, and often goes on to
use even more memory.  That significantly impedes the kernel's ability
to keep things responsive when memory contention arises.

I understand that it isn’t a design goal for us, but perhaps we should pay some 
attention to this possibility?  Perhaps we’ll have interest in co-locating 
clients on servers in the near future as part of a replication, network 
striping, or archiving capability?
A lot of work will be needed to make Lustre's memory usage
more dynamic, more aware of changing conditions on the system, and
more responsive to the kernel's requests to free memory.  I imagine it
won't be terribly easy, especially in areas such as dirty and unstable
data, which cannot be freed until it is safe on disk.  But even for that,
there are no doubt ways to make things better.
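
To make the point about dirty data a bit more concrete, here is a minimal
sketch (not anything from Lustre itself) that parses /proc/meminfo-style
text and estimates what fraction of RAM is dirty or under writeback, i.e.
not freeable until it reaches disk.  The helper names and the sample
numbers are made up for illustration.  It only shows the kernel-wide view;
whether Lustre's internal caches show up in these counters is not something
this thread establishes.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field_name: value} dict.

    Most values are in kB; a few fields (e.g. HugePages_Total) are plain
    counts and are stored as-is.
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def dirty_fraction(fields):
    """Fraction of total RAM that is dirty or currently being written back."""
    total = fields.get("MemTotal", 0)
    if total == 0:
        return 0.0
    pinned = fields.get("Dirty", 0) + fields.get("Writeback", 0)
    return pinned / total


# Hypothetical sample, not measured on any real system.
SAMPLE = """MemTotal:       16000000 kB
MemAvailable:    4000000 kB
Dirty:           1200000 kB
Writeback:        400000 kB
"""

if __name__ == "__main__":
    info = parse_meminfo(SAMPLE)
    print(f"dirty + writeback: {dirty_fraction(info):.1%} of RAM")
```

On a Linux node the same functions could be fed the contents of
/proc/meminfo (read with open("/proc/meminfo").read()) and sampled
periodically while running IO to watch how the ratio moves.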

Chris

_______________________________________________
lustre-discuss mailing list
lustre-discuss@lists.lustre.org
http://lists.lustre.org/listinfo.cgi/lustre-discuss-lustre.org

