Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the 
following link:
https://bugzilla.lustre.org/show_bug.cgi?id=12418



I think this is to be expected: with one IOR still running, eviction
processing has to compete with its IO, so eviction time suffers.  Adaptive
timeouts will not help at all.

There is a case to be made for a policy that prioritizes eviction requests
over IO requests.  This will cause a small delay in IO requests from the
surviving IOR, but it will keep the server from processing useless IO that is
still in progress from the dying IOR.

There is a solution to this that is perhaps not difficult to implement.  When
eviction RPCs begin processing they raise a flag, and they lower it when they
are done.  Multiple eviction threads can all raise and lower the flag, so it
is effectively a counter.

All IO processing threads check the same flag both before the network bulk
transfer and before disk IO (both can cause delays).  If the flag is raised
(>0), the IO thread waits until it is lowered to 0 and then simply proceeds
(without holding the flag or anything like that).
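
To make the idea concrete, here is a rough user-space sketch in Python.  It is
not Lustre code: the names EvictionGate, handle_eviction and handle_io are
hypothetical, and the counter plus condition variable is only one assumed way
to build the flag described above.

```python
import threading


class EvictionGate:
    """Hypothetical counting flag: >0 while any eviction is in progress."""

    def __init__(self):
        self._cond = threading.Condition()
        self._active = 0  # number of evictions currently being processed

    def raise_flag(self):
        with self._cond:
            self._active += 1

    def lower_flag(self):
        with self._cond:
            self._active -= 1
            if self._active == 0:
                # Last eviction finished: release every waiting IO thread.
                self._cond.notify_all()

    def wait_until_clear(self, timeout=None):
        """Block until the flag is 0; returns False if the timeout expires."""
        with self._cond:
            return self._cond.wait_for(lambda: self._active == 0, timeout)


def handle_eviction(gate, evict):
    """Eviction thread: raise the flag, do the work, always lower it again."""
    gate.raise_flag()
    try:
        evict()
    finally:
        gate.lower_flag()


def handle_io(gate, bulk_transfer, disk_io):
    """IO thread: check the flag before each step that can cause delays.

    The thread only waits; it does not hold the flag while doing its IO.
    """
    gate.wait_until_clear()
    bulk_transfer()
    gate.wait_until_clear()
    disk_io()
```

Because IO threads never hold the flag, an eviction that arrives while a bulk
transfer is already under way is not blocked by it; the IO thread simply stops
at its next check, before disk IO.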

Our future plans call for server-driven quality of service, which gives
requests from different sources different priorities.  Prioritizing evictions
over IO is a very simple example of that.

_______________________________________________
Lustre-devel mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-devel
