westonpace commented on a change in pull request #9533: URL: https://github.com/apache/arrow/pull/9533#discussion_r579724055
########## File path: cpp/src/arrow/util/future.cc ########## @@ -250,30 +250,40 @@ class ConcreteFutureImpl : public FutureImpl { } void DoMarkFinishedOrFailed(FutureState state) { - { - // Lock the hypothetical waiter first, and the future after. - // This matches the locking order done in FutureWaiter constructor. - std::unique_lock<std::mutex> waiter_lock(global_waiter_mutex); - std::unique_lock<std::mutex> lock(mutex_); - - DCHECK(!IsFutureFinished(state_)) << "Future already marked finished"; - state_ = state; - if (waiter_ != nullptr) { - waiter_->MarkFutureFinishedUnlocked(waiter_arg_, state); - } + // Lock the hypothetical waiter first, and the future after. + // This matches the locking order done in FutureWaiter constructor. + std::unique_lock<std::mutex> waiter_lock(global_waiter_mutex); + std::unique_lock<std::mutex> lock(mutex_); + + DCHECK(!IsFutureFinished(state_)) << "Future already marked finished"; + state_ = state; + if (waiter_ != nullptr) { + waiter_->MarkFutureFinishedUnlocked(waiter_arg_, state); } + + // Waiters are alerted before callbacks are run + waiter_lock.unlock(); cv_.notify_all(); - // run callbacks, lock not needed since the future is finsihed by this - // point so nothing else can modify the callbacks list and it is safe - // to iterate. - // - // In fact, it is important not to hold the locks because the callback - // may be slow or do its own locking on other resources - for (auto&& callback : callbacks_) { - std::move(callback)(); + std::vector<Callback> callbacks_to_run = std::move(callbacks_); Review comment: Nevermind. Even with the change it still isn't very safe to try and rely on `AddCallback` ordering. Parent layers were doing things like... ``` if (fut.IsFinished()) { // Do something, this can run before all callbacks run } ``` Preventing the future itself from being marked finished until after all the callbacks run seems like too much complexity and performance hit. Instead, if something needs to ensure a callback is run it needs to use `Then` and return a new future even though this added a bit of complexity (need to store shared_ptr in the queue) to serial readback. ---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org