On Apr 17, 2015, at 11:37 AM, John Spray <john.sp...@redhat.com> wrote:
> On 17/04/2015 17:22, Jan Kara wrote:
>> On Fri 17-04-15 17:08:10, John Spray wrote:
>>> On 17/04/2015 16:43, Jan Kara wrote:
>>> In that case I'm confused -- why would ENOSPC be an appropriate use
>>> of this interface if the mount being entirely blocked would be
>>> inappropriate?  Isn't being unable to service any I/O a more
>>> fundamental and severe thing than being up and healthy but full?
>>> 
>>> Were you intending the interface to be exclusively for data
>>> integrity issues like checksum failures, rather than more general
>>> events about a mount that userspace would probably like to know
>>> about?
>>   Well, I'm not saying we cannot have those events for fs availability /
>> unavailability. I'm just saying I'd like to see some use for that first.
>> I don't want events to be added just because it's possible...
>> 
>> For ENOSPC we have thin-provisioned storage and the userspace daemon
>> shuffling real storage underneath. So there I know the use case.
>> 
> 
> Ah, OK.  So I can think of a couple of use cases:
> * a cluster scheduling service (think MPI jobs or docker containers) might 
> check for events like this.  If it can see the cluster filesystem is 
> unavailable, then it can avoid scheduling the job, so that the (multi-node) 
> application does not get hung on one node with a bad mount.  If it sees a 
> mount go bad (unavailable, or client evicted) partway through a job, then it 
> can kill -9 the process that was relying on the bad mount, and go run it 
> somewhere else.
> * Boring but practical case: a Nagios health check that verifies mounts are
> OK.

John,
thanks for chiming in, as I was just about to write the same.  Some users
were just asking yesterday at the Lustre User Group meeting about adding
an interface to notify job schedulers for your #1 point, and I'd much
rather use a generic interface than invent our own for Lustre.

Cheers, Andreas

> We don't have to invent these event types now of course, but something to 
> bear in mind.  Hopefully if/when any of the distributed filesystems 
> (Lustre/Ceph/etc) choose to implement this, we can look at making the event 
> types common at that time though.
> 
> BTW in any case an interface for filesystem events to userspace will be a 
> useful addition, thank you!
> 
> Cheers,
> John

