[jira] [Updated] (HDFS-9643) libhdfs++: Support async cancellation of read operations
[ https://issues.apache.org/jira/browse/HDFS-9643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

James Clampffer updated HDFS-9643:
----------------------------------
    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

Committed to HDFS-8707. Thanks for the reviews and feedback [~bobhansen]!

> libhdfs++: Support async cancellation of read operations
> ---------------------------------------------------------
>
>                 Key: HDFS-9643
>                 URL: https://issues.apache.org/jira/browse/HDFS-9643
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: hdfs-client
>            Reporter: James Clampffer
>            Assignee: James Clampffer
>         Attachments: HDFS-8643.HDFS-8707.004.patch, HDFS-9643.HDFS-8707.000.patch, HDFS-9643.HDFS-8707.001.patch, HDFS-9643.HDFS-8707.002.patch, HDFS-9643.HDFS-8707.003.patch
>
> It should be possible for any thread to cancel operations in progress on a FileHandle. Any ephemeral objects created by the FileHandle should free resources as quickly as possible.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Updated] (HDFS-9643) libhdfs++: Support async cancellation of read operations
James Clampffer updated HDFS-9643:
----------------------------------
    Attachment: HDFS-8643.HDFS-8707.004.patch

Thanks for taking another look and catching that cancel call [~bobhansen]. I've changed it to a shutdown() + close(). The docs say calling shutdown first should keep things more portable.
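The shutdown-then-close ordering mentioned above can be illustrated with plain POSIX sockets. This is only a sketch and not the code from the patch, which operates on an asio socket; the helper name `ShutdownAndClose` is hypothetical.

```cpp
#include <sys/socket.h>
#include <unistd.h>

#include <cassert>
#include <cerrno>

// Hypothetical helper, not taken from the patch: stop traffic in both
// directions on a connected socket, then release the descriptor.
// Returns 0 on success, or the errno value reported by close().
inline int ShutdownAndClose(int fd) {
  assert(fd >= 0);
  // Errors from shutdown() (for example ENOTCONN when the peer has already
  // gone away) are deliberately ignored so that close() always runs and the
  // descriptor is never leaked.
  (void)::shutdown(fd, SHUT_RDWR);
  return ::close(fd) == 0 ? 0 : errno;
}
```

After `ShutdownAndClose` runs on one end of a connected pair, a read on the other end returns 0 (end of file) rather than blocking.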
[jira] [Updated] (HDFS-9643) libhdfs++: Support async cancellation of read operations
James Clampffer updated HDFS-9643:
----------------------------------
    Attachment: HDFS-9643.HDFS-8707.003.patch

New patch:
-resize array after remove_if
-move DataConnectionImpl::Cancel implementation out of header
-fix the comment [~bobhansen] mentioned
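The first item refers to a common pitfall: `std::remove_if` only moves the surviving elements to the front and leaves the container's size unchanged, so the tail has to be trimmed afterwards. A minimal sketch follows, assuming the array is a vector of `weak_ptr`s to readers. `BlockReader` here is an empty stand-in, `PruneExpired` is a hypothetical name, and `erase` is used as one way to do the trimming; the patch may do it differently.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Empty stand-in for the real BlockReader type (assumption for illustration).
struct BlockReader {};

// Hypothetical helper: drop entries whose reader has already been destroyed.
// Returns the number of entries removed.
inline std::size_t PruneExpired(
    std::vector<std::weak_ptr<BlockReader>>& readers) {
  const std::size_t before = readers.size();
  // remove_if shifts the kept elements forward and returns the new logical
  // end; the vector still has its old size until the tail is erased.
  auto new_end = std::remove_if(
      readers.begin(), readers.end(),
      [](const std::weak_ptr<BlockReader>& r) { return r.expired(); });
  readers.erase(new_end, readers.end());
  assert(readers.size() <= before);
  return before - readers.size();
}
```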
[jira] [Updated] (HDFS-9643) libhdfs++: Support async cancellation of read operations
James Clampffer updated HDFS-9643:
----------------------------------
    Attachment: HDFS-9643.HDFS-8707.002.patch

New patch:
-FileHandle now holds weak_ptrs to BlockReaders. When CancelOperations is called it finds all living BlockReaders and calls CancelOperation on them. This eventually results in cancel being called on the underlying asio::tcp::socket inside the reader's DataNodeConnection. This should help in the case [~bobhansen] mentioned, when the pipeline is stalled waiting to time out during a network partition or something similar.
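The weak_ptr arrangement described above might look roughly like the following. This is a sketch under stated assumptions rather than the patch itself: the class bodies, `CreateBlockReader`, the mutex, and the `cancelled()` accessor are all hypothetical, and the real `CancelOperation` would go on to cancel the socket instead of setting a flag.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <vector>

// Stand-in for the real BlockReader (assumption for illustration).
class BlockReader {
 public:
  // The real reader would cancel its DataNodeConnection's socket here.
  void CancelOperation() { cancelled_.store(true); }
  bool cancelled() const { return cancelled_.load(); }

 private:
  std::atomic<bool> cancelled_{false};
};

class FileHandle {
 public:
  // Hypothetical factory: the caller owns the reader, the FileHandle only
  // keeps a weak reference so it never extends the reader's lifetime.
  std::shared_ptr<BlockReader> CreateBlockReader() {
    auto reader = std::make_shared<BlockReader>();
    std::lock_guard<std::mutex> lock(mutex_);
    readers_.push_back(reader);
    return reader;
  }

  // Cancels every reader that is still alive; returns how many were found.
  std::size_t CancelOperations() {
    std::lock_guard<std::mutex> lock(mutex_);
    std::size_t living = 0;
    for (const auto& weak : readers_) {
      // lock() yields an empty shared_ptr if the reader was already freed.
      if (auto reader = weak.lock()) {
        reader->CancelOperation();
        ++living;
      }
    }
    assert(living <= readers_.size());
    return living;
  }

 private:
  std::mutex mutex_;
  std::vector<std::weak_ptr<BlockReader>> readers_;
};
```

Because only weak references are stored, a reader that has already finished and been released is skipped rather than kept alive by the FileHandle.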
[jira] [Updated] (HDFS-9643) libhdfs++: Support async cancellation of read operations
James Clampffer updated HDFS-9643:
----------------------------------
    Attachment: HDFS-9643.HDFS-8707.001.patch

New patch:
-Added tests into the remote_block_reader_test because most of the infrastructure was already there.
-Pushed all of the cancellation checking logic into Pipeline. The pipeline can optionally take a CancelHandle. It will also propagate out Cancel statuses returned from continuations.
-CancelTracker uses a std::atomic_bool instead of a mutex+bool
-FileHandle checks CancelHandle and bails out of any new operations if the handle has been canceled.
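As a rough illustration of the items above, here is a simplified, synchronous sketch of a pipeline that checks an optional cancel handle before each stage and passes along a canceled status returned by a stage. The real libhdfs++ pipeline is asynchronous and continuation-based; the method names, the `Status` enum, and the `std::function` stages below are assumptions made for illustration only.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <memory>
#include <utility>
#include <vector>

// A flag that any thread may set; std::atomic_bool avoids a mutex + bool.
class CancelTracker {
 public:
  void set_canceled() { canceled_.store(true); }
  bool is_canceled() const { return canceled_.load(); }

 private:
  std::atomic_bool canceled_{false};
};
using CancelHandle = std::shared_ptr<CancelTracker>;

// Hypothetical status type for this sketch.
enum class Status { kOk, kCanceled, kError };

class Pipeline {
 public:
  // The cancel handle is optional; a null handle means "never canceled".
  explicit Pipeline(CancelHandle cancel = nullptr)
      : cancel_(std::move(cancel)) {}

  Pipeline& Push(std::function<Status()> stage) {
    stages_.push_back(std::move(stage));
    return *this;
  }

  Status Run() {
    for (auto& stage : stages_) {
      // Central cancellation check, so individual stages need no boilerplate.
      if (cancel_ && cancel_->is_canceled()) return Status::kCanceled;
      const Status s = stage();
      // Any non-OK status, including kCanceled from a stage, stops the run
      // and is returned to the caller unchanged.
      if (s != Status::kOk) return s;
    }
    return Status::kOk;
  }

 private:
  CancelHandle cancel_;
  std::vector<std::function<Status()>> stages_;
};
```

Keeping the check inside `Run` is what lets each stage stay free of cancellation code, which is the boilerplate the patch note says was moved into the pipeline.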
[jira] [Updated] (HDFS-9643) libhdfs++: Support async cancellation of read operations
James Clampffer updated HDFS-9643:
----------------------------------
    Attachment: HDFS-9643.HDFS-8707.000.patch

Initial patch. I've manually tested it, but I need to sort out the failures hidden by HDFS-9610 before I can write decent unit tests.

Open questions:
-Right now the cancel logic is added directly to each continuation in the remote block reader. On one hand this is simple and works; on the other, it's boilerplate code. Is this worth pushing into the continuation pipeline code at the moment? I think it's worth keeping it simple until NN operations become cancelable.
-In this implementation FileHandle::CancelOperations is irreversible and prevents the FileHandle from being used again. Can anyone think of a reason not to have it also close the file or at least clear vector?
-Should the FileHandle have a callback when it knows that there are no pending operations? It should be possible to just check the reference count on the CancelHandle to verify.
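The reference-count idea in the last question could be sketched as below, assuming the CancelHandle is a `std::shared_ptr` and that each pending operation holds its own copy. Both are assumptions for illustration, and `NoPendingOperations` is a hypothetical name. Note that `use_count()` is only an approximation when other threads are copying or releasing the handle concurrently, so this is a hint rather than a guarantee.

```cpp
#include <atomic>
#include <cassert>
#include <memory>

// Minimal stand-in for the tracker behind a CancelHandle (assumption).
struct CancelTracker {
  std::atomic_bool canceled{false};
};
using CancelHandle = std::shared_ptr<CancelTracker>;

// Hypothetical check: if the owner's handle is the only reference left,
// no operation is still holding a copy.
inline bool NoPendingOperations(const CancelHandle& owner_handle) {
  assert(owner_handle != nullptr);
  return owner_handle.use_count() == 1;
}
```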
[jira] [Updated] (HDFS-9643) libhdfs++: Support async cancellation of read operations
James Clampffer updated HDFS-9643:
----------------------------------
    Status: Patch Available  (was: Open)