[ 
https://issues.apache.org/jira/browse/HDFS-9643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15102641#comment-15102641
 ] 

James Clampffer commented on HDFS-9643:
---------------------------------------

Thanks for checking this out [~bobhansen]!

"Rather than having a lock in the cancellation handle, we should be able to get 
away with a std::atomic<bool>."
Sounds good.  I originally went down that path, but switched to a lock for some 
reason that I can't remember at the moment.  It certainly looks like I can 
switch back to std::atomic_bool.
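
For illustration, a minimal sketch of a lock-free cancellation handle built on 
std::atomic<bool>.  The class and method names here (CancelHandle, 
set_canceled, is_canceled) are hypothetical and are not taken from the patch.

```cpp
#include <atomic>

// Hypothetical cancellation handle; the names are illustrative only and are
// not the libhdfs++ API.
class CancelHandle {
public:
  CancelHandle() : canceled_(false) {}

  // Safe to call from any thread without a lock.  Returns true only for the
  // caller that actually flipped the flag from false to true.
  bool set_canceled() {
    bool expected = false;
    return canceled_.compare_exchange_strong(expected, true);
  }

  // Cheap check that in-progress operations can poll between steps.
  bool is_canceled() const { return canceled_.load(); }

private:
  std::atomic<bool> canceled_;
};
```

The flag only ever moves from false to true, so a single atomic is enough for 
this sketch and no mutex is required.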

"The pipeline class already has a concept of annihilating error handling - if 
!status.ok(), skip over the rest of the pipeline and deliver an error. That 
would seem to be an opportune place to put in a cancellation check."
Sounds good to me.
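
A rough sketch of how the cancellation check could sit next to the existing 
!status.ok() short-circuit.  Status, Stage and RunPipeline below are simplified 
stand-ins invented for this example, not the real pipeline classes.

```cpp
#include <atomic>
#include <functional>
#include <string>
#include <vector>

// Simplified stand-in for a status type (an assumption for this sketch).
class Status {
public:
  Status() : ok_(true) {}
  explicit Status(const std::string &msg) : ok_(false), msg_(msg) {}
  bool ok() const { return ok_; }
  const std::string &message() const { return msg_; }

private:
  bool ok_;
  std::string msg_;
};

typedef std::function<Status()> Stage;

// Runs the stages in order.  A failed stage or a set cancel flag skips the
// rest of the pipeline and delivers an error to the caller.
Status RunPipeline(const std::vector<Stage> &stages,
                   const std::atomic<bool> &canceled) {
  Status status;
  for (const Stage &stage : stages) {
    // Cancellation is checked before each stage starts.
    if (canceled.load()) {
      return Status("Operation canceled");
    }
    status = stage();
    if (!status.ok()) {
      return status;  // the usual error short-circuit
    }
  }
  return status;
}
```

In this sketch the flag is only observed between stages, so a stage that is 
already running finishes before the cancellation takes effect.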

"I think we need to have it close the file so that any outstanding async 
requests get their callback, and can respond to the cancellation. Having 
well-defined behavior after cancellation for stateful objects can be tricky; 
what should the position be? Are there cases where continuing streaming reads 
on a file with an undefined position would be useful? I suppose they could 
reset the position to a known good one and continue reading."
I agree.  For now maybe it's worth keeping it simple by avoiding doing much of 
anything other than setting the flag when CancelOperations is called on the FH 
and preventing that FH from initiating new operations.  It needs to stay around 
long enough for the ephemeral objects to see that the flag has changed.
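
One possible way to sketch the "set the flag, refuse new operations" behavior 
is to share the flag through a std::shared_ptr so it outlives the file handle 
for as long as any ephemeral object still holds it.  FileHandleSketch, 
ReadOperation and StartRead are hypothetical names used only for this example; 
sharing the flag rather than keeping the whole FH alive is an assumption of the 
sketch, not what the patch does.

```cpp
#include <atomic>
#include <memory>

typedef std::shared_ptr<std::atomic<bool> > CancelFlag;

// Hypothetical stand-in for an ephemeral per-read object.
class ReadOperation {
public:
  explicit ReadOperation(CancelFlag flag) : flag_(flag) {}

  // Polled between steps; true means stop and release resources.
  bool should_abort() const { return flag_->load(); }

private:
  CancelFlag flag_;  // keeps the flag alive even if the file handle is gone
};

// Hypothetical file handle that only sets a flag on cancellation.
class FileHandleSketch {
public:
  FileHandleSketch() : flag_(std::make_shared<std::atomic<bool> >(false)) {}

  // Callable from any thread; does nothing beyond setting the flag.
  void CancelOperations() { flag_->store(true); }

  // Returns nullptr once the handle has been canceled, so no new
  // operations are initiated.
  std::unique_ptr<ReadOperation> StartRead() {
    if (flag_->load()) {
      return nullptr;
    }
    return std::unique_ptr<ReadOperation>(new ReadOperation(flag_));
  }

private:
  CancelFlag flag_;
};
```

An operation started before CancelOperations is called still sees the flag 
change, even if the handle itself has already been destroyed.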

"Cancelling the file handle, but still able to efficiently do preads is a nice 
feature. Cancelling in-flight preads is another aspect to be handled (although 
that could be a different JIRA)."
In flight in the sense that control has been handed over to asio?  I'd like to 
push that into another JIRA for now to keep this patch relatively simple, but I 
think that's very important to have working as well.

"It would need to be holding a lock preventing additional operations for that 
to make sense, and calling into consumer code while holding a lock is always a 
dangerous proposition. What's the use case for the callback? Is it compelling?"
The case I had in mind was a synchronous application that does a bunch of prep 
work to set up its own async ephemeral objects that in turn call into 
libhdfs++.  If that application wants to tear down its ephemeral object(s) and 
reclaim buffers as quickly as possible, it'd be nice to be able to check that 
it's safe to do so without each operation callback notifying the synchronous 
bits that the work is completed.  The more I think this case out, the more it 
seems like that sort of application should just be using the synchronous 
interfaces.

> libhdfs++: Support async cancellation of read operations
> --------------------------------------------------------
>
>                 Key: HDFS-9643
>                 URL: https://issues.apache.org/jira/browse/HDFS-9643
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
>         Attachments: HDFS-9643.HDFS-8707.000.patch
>
>
> It should be possible for any thread to cancel operations in progress on a 
> FileHandle.  Any ephemeral objects created by the FileHandle should free 
> resources as quickly as possible.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)