Re: [capnproto] Re: Correctly revoking capabilities holding onto resources
Hi Kenton,

I've been digging around the last couple of days. The approach of adding *MembranePolicy::getCanceler() -> kj::Maybe* seems to work for getting rid of outstanding requests from within *MembraneRequestHook* and *MembraneHook*, but I've run into some snags emulating the behaviour of *MembranePolicy::onRevoked()* synchronously: with just the above, there is no way to determine whether a membrane has been revoked, nor which exception to raise when attempting new requests. I have added two new functions to address this (which happen to preserve backward compatibility):

- *MembranePolicy::isRevoked() -> bool*, which is not strictly necessary but aids comprehension, as the next function could instead be made to return a *kj::Maybe*.
- *MembranePolicy::getRevocationReason() -> kj::Exception*, which is expected to throw if called when *isRevoked()* would return *false*.

This allows us to reject not just outstanding requests but *future* requests (without relying on *onRevoked()*): the new request promises are substituted with promises rejected with the exception provided by *MembranePolicy::getRevocationReason()*.

This comes with a shortcoming: because we are no longer *notified* about revocation inside the hooks, we cannot permanently revoke existing capabilities (see *MembraneHook::MembraneHook*, where we replace the wrapped capability with *newBrokenCap()*). This causes a knock-on effect: if the policy is unrevoked, then all the membraned capabilities which were previously rejecting new requests will suddenly become active again, which I don't think is desirable behaviour. The only way I can think of to make this work like *MembranePolicy::onRevoked()* would be to register a callback, but then we'd need to return some sort of RAII subscription so that our hooks don't get called after being dropped by a client, and I don't think KJ has any infrastructure for that.
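To make the shape of the two proposed functions concrete, here is a minimal plain-C++ sketch, with std types standing in for kj ones (kj::Exception, kj::Maybe); *MockPolicy* and *tryCall* are illustrative names, not part of the actual patch:

```cpp
#include <optional>
#include <stdexcept>
#include <string>

// Plain-C++ mock of the proposed MembranePolicy additions. A real policy
// would hold a kj::Exception rather than a string reason.
struct MockPolicy {
  std::optional<std::string> revocationReason;  // nullopt => not revoked

  bool isRevoked() const { return revocationReason.has_value(); }

  // Expected to throw if called while isRevoked() would return false.
  std::runtime_error getRevocationReason() const {
    if (!revocationReason) throw std::logic_error("policy is not revoked");
    return std::runtime_error(*revocationReason);
  }
};

// What a hook would do before starting a new request: substitute a
// rejected result instead of calling through the membrane.
std::string tryCall(const MockPolicy& policy) {
  if (policy.isRevoked()) {
    return std::string("rejected: ") + policy.getRevocationReason().what();
  }
  return "ok";  // a real hook would forward the call here
}
```

In the real hooks the "rejected" branch would return a promise rejected with the policy's exception, rather than a string.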
So *MembranePolicy* ends up with something like the following new virtual functions:

- *MembranePolicy::getCanceler() -> kj::Maybe* // wraps requests made in hooks
- *MembranePolicy::onRevoked(kj::Function&& callback) -> kj::Subscription*

The following diff shows what I've got so far (as per the first suggested implementation): https://github.com/capnproto/capnproto/compare/master...Zentren:capnproto:sync-membrane-policy

Any thoughts?

Thanks,
Rowan Reeve

On Wednesday, April 5, 2023 at 8:28:18 PM UTC+2 ken...@cloudflare.com wrote:

Hi Rowan,

Right, MembranePolicy today uses exclusiveJoin() to effect an asynchronous revocation. The problem with such asynchronous revocation is that it provides no explicit guarantee of when it's safe to assume revocation has occurred. To use this safely, you probably need to detect when your objects' destructors are eventually called, when their refcount reaches zero, before you can assume they are no longer used.

What I'm saying is we should probably add a new feature to MembranePolicy that uses kj::Canceler instead of exclusiveJoin(). kj::Canceler allows synchronous revocation. I think this should probably be a feature of MembranePolicy; I agree it's annoying to force applications to do it manually. Perhaps we could deprecate the old exclusiveJoin() approach, too, as the synchronous approach seems strictly better.

Note that doing something like `onRevoked.then([]() { canceler.cancel(); })` doesn't really solve anything, since the `.then()` runs asynchronously, so once again you have no guarantee of when it will take effect.

-Kenton

On Thu, Mar 30, 2023 at 4:25 PM Rowan Reeve wrote:
Re: [capnproto] Re: Correctly revoking capabilities holding onto resources
Hi Kenton,

No stress, your time is given freely and I appreciate it. Your suggestion makes sense as an immediate method of cancelling outstanding requests wrapped inside a membrane.

After a look over *membrane.c++*, I do not see a *kj::Canceler* in use, so I presume this is done using *kj::Promise::exclusiveJoin*. I think I see three scenarios being dealt with when *kj::MembranePolicy::onRevoked* resolves:

1. Existing requests are eventually rejected, but the underlying call path might still run depending on exclusive join resolution order (i.e. it will run if made before *onRevoked* was resolved). [1] <https://github.com/capnproto/capnproto/blob/v0.10.3/c%2B%2B/src/capnp/membrane.c%2B%2B#L213> [2] <https://github.com/capnproto/capnproto/blob/v0.10.3/c%2B%2B/src/capnp/membrane.c%2B%2B#L225>
2. New requests against a capability obtained before revocation are rejected. [1] <https://github.com/capnproto/capnproto/blob/v0.10.3/c%2B%2B/src/capnp/membrane.c%2B%2B#L467>
3. New requests against a capability obtained after revocation (replaced with a dummy) are rejected. [1] <https://github.com/capnproto/capnproto/blob/v0.10.3/c%2B%2B/src/capnp/membrane.c%2B%2B#L331>

I think requests from (1) can be immediately cancelled given access to a *kj::Canceler* wrapping all membraned requests. Given the dummy capability injected in (3), I think those requests are safely rejected as-is. I do however have a concern with (2): is it guaranteed that these new requests will be resolved *after* *onRevoked* is processed? I'd presume requests would end up on the event queue in order, but I wonder if there could be any race conditions involved. *If* it is all processed in order, is it then also safe to assume that *kj* will eagerly evaluate an *onRevoked.then([this]() { canceler.cancel(); })* relying on the result of *onRevoked*, i.e. that *this* is still safe to use in the *MembraneHook* and/or *MembraneRequestHook*?
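The ordering concern can be illustrated with a toy model (this is not kj's actual event loop, just a plain FIFO of callbacks; *Loop* and *runScenario* are illustrative). A cancellation delivered through the queue is itself just another event, so a request queued ahead of it still runs; flipping the state synchronously takes effect before the next turn:

```cpp
#include <functional>
#include <queue>

// Toy single-threaded event queue standing in for the kj event loop.
// Illustrative only: real kj promises are not plain callbacks.
struct Loop {
  std::queue<std::function<void()>> events;
  void enqueue(std::function<void()> f) { events.push(std::move(f)); }
  void run() {
    while (!events.empty()) {
      auto f = std::move(events.front());
      events.pop();
      f();
    }
  }
};

// Counts how many "requests" execute when revocation is delivered
// asynchronously (queued, onRevoked.then-style) vs. synchronously
// (kj::Canceler-style flag flip before returning to the loop).
int runScenario(bool cancelSynchronously) {
  Loop loop;
  bool revoked = false;
  int callsRun = 0;

  // A request already on the queue when revocation is signalled.
  loop.enqueue([&] { if (!revoked) ++callsRun; });

  if (cancelSynchronously) {
    revoked = true;  // takes effect before the next turn
  } else {
    // Cancellation is itself a queued event, behind the pending request.
    loop.enqueue([&] { revoked = true; });
  }

  loop.run();
  return callsRun;
}
```

In the asynchronous scenario the pending request still runs; in the synchronous one it is rejected, which is the gap between exclusiveJoin()-based revocation and a kj::Canceler.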
It's a pity that the user would need to be responsible for manually cancelling outstanding requests in addition to rejecting the promises exposed by *kj::MembranePolicy::onRevoked* (unless I'm missing something). It seems like *kj::MembranePolicy::onRevoked* is intended to produce promises from a *kj::ForkedPromise* under the hood, which itself seems to have been done as a convenience, since it provides a single-producer/multi-consumer interface to this revocation "event", and *kj::Promise::exclusiveJoin* already existed to reject calls. Could another single-producer/multi-consumer *protected* interface be exposed by *kj::MembranePolicy* which handles all this inline, i.e. without going to the event loop, while leaving the public interface unchanged?

Given your current and future feedback, could I raise an issue and look into creating a draft PR on GitHub to start exploring the change that you've suggested? I will probably only get to writing any code from the 10th of April, so further discussion can occur here and/or on the issue in the meantime (whichever is preferred).

Looking forward to hearing from you,
Rowan Reeve

On Monday, March 27, 2023 at 4:43:51 PM UTC+2 ken...@cloudflare.com wrote:

> Hi Rowan,
>
> Sorry for the slow reply, my inbox is overloaded as always.
>
> Indeed, since the `onRevoked` mechanism is triggered by a promise, the actual revocation and cancellation occurs asynchronously. It's possible that some other promise will be queued in between the point where you resolve the revoker promise and when the revocation actually takes effect.
>
> kj::Canceler has better behavior, in that all cancellation happens synchronously. But capnp::Membrane does not currently use that. I have myself hit this a couple of times and ended up using hacks like you suggest.
>
> Perhaps we should extend MembranePolicy with `getCanceler()` that returns `kj::Maybe`.
> If non-null, the canceler wraps all promises and capabilities passed through the membrane.
>
> -Kenton
>
> On Mon, Mar 27, 2023 at 7:35 AM Rowan Reeve wrote:
>
>> I've added an ugly unit test to a branch on my GitHub to illustrate:
>>
>> https://github.com/capnproto/capnproto/compare/master...Zentren:capnproto:membrane_issue?expand=1#diff-49ad79a4fffcbe88fcd8681ec67d49f5f6e5fc9010961c1b10ef1b462f0e957eR477
>>
>> Note line 477 in *c++/src/capnp/membrane-test.c++*, where I'd expect the request to have been cancelled as per the membrane policy *onRevoked()* docs ("all outstanding calls cancelled"). Looking at the behavior, it seems like chained promises in the request are not cancelled as part of this (only the initial *call(CallContext)*, IF we have not yet entered its body).
>>
>> Thanks,
[capnproto] Re: Correctly revoking capabilities holding onto resources
I've added an ugly unit test to a branch on my GitHub to illustrate:

https://github.com/capnproto/capnproto/compare/master...Zentren:capnproto:membrane_issue?expand=1#diff-49ad79a4fffcbe88fcd8681ec67d49f5f6e5fc9010961c1b10ef1b462f0e957eR477

Note line 477 in *c++/src/capnp/membrane-test.c++*, where I'd expect the request to have been cancelled as per the membrane policy *onRevoked()* docs ("all outstanding calls cancelled"). Looking at the behavior, it seems like chained promises in the request are not cancelled as part of this (only the initial *call(CallContext)*, IF we have not yet entered its body).

Thanks,
Rowan Reeve

On Wednesday, March 15, 2023 at 3:42:39 PM UTC+2 Rowan Reeve wrote:

> Hi Kenton,
>
> I am encountering a problem where capabilities acting as views over some resources are intermittently causing segfaults. The capability is wrapped using *capnp::membrane* given a membrane policy where the promise returned by *onRevoked* can be rejected on demand via a synchronous reject function (a kj::PromiseFulfillerPair is used to do this).
>
> The resources may be destroyed together at any time, whereupon the membrane managing the capabilities accessing the resource state is revoked. However, this does not seem to be an instantaneous operation (presumably due to revocation being managed by a promise), and I have encountered the following issue as a result:
>
> Unresolved requests made before the membrane policy has been revoked, and where the resource has since been destroyed, are not cancelled but will rather resolve, accessing invalid memory.
>
> The workaround I have found is to add a flag and a *kj::Canceler* to the capability implementations, whereby new requests are rejected if the flag is set; in addition, when the flag is first set, the canceler cancels all returned promises in cases where a chained promise was returned rather than *kj::READY_NOW*.
> However, this is very ugly and necessitates keeping around references to the capability implementations before they are converted to *::Client* objects (so that we can set that flag). I'm thinking that surely there has to be a better way I have not considered.
>
> Do you have any thoughts on a better solution to this problem? If needed, I can try to create a minimal reproducible example to illustrate.
>
> In case it matters, the OS is Ubuntu 20.04 and the capnp version is 8.0.0, both dictated by my production environment.
>
> Thank you for your time,
>
> Rowan Reeve

--
You received this message because you are subscribed to the Google Groups "Cap'n Proto" group. To unsubscribe from this group and stop receiving emails from it, send an email to capnproto+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/capnproto/2d126940-b82e-4ef8-9f41-304d8a23c97cn%40googlegroups.com.
[capnproto] Correctly revoking capabilities holding onto resources
Hi Kenton,

I am encountering a problem where capabilities acting as views over some resources are intermittently causing segfaults. The capability is wrapped using *capnp::membrane* given a membrane policy where the promise returned by *onRevoked* can be rejected on demand via a synchronous reject function (a kj::PromiseFulfillerPair is used to do this).

The resources may be destroyed together at any time, whereupon the membrane managing the capabilities accessing the resource state is revoked. However, this does not seem to be an instantaneous operation (presumably due to revocation being managed by a promise), and I have encountered the following issue as a result:

Unresolved requests made before the membrane policy has been revoked, and where the resource has since been destroyed, are not cancelled but will rather resolve, accessing invalid memory.

The workaround I have found is to add a flag and a *kj::Canceler* to the capability implementations, whereby new requests are rejected if the flag is set; in addition, when the flag is first set, the canceler cancels all returned promises in cases where a chained promise was returned rather than *kj::READY_NOW*. However, this is very ugly and necessitates keeping around references to the capability implementations before they are converted to *::Client* objects (so that we can set that flag). I'm thinking that surely there has to be a better way I have not considered.

Do you have any thoughts on a better solution to this problem? If needed, I can try to create a minimal reproducible example to illustrate.

In case it matters, the OS is Ubuntu 20.04 and the capnp version is 8.0.0, both dictated by my production environment.

Thank you for your time,

Rowan Reeve
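The flag-based part of the workaround can be sketched in plain C++ (std types standing in for kj/capnp ones; *Resource* and *ResourceView* are illustrative names, and the kj::Canceler that would also cancel already-returned chained promises is not modelled here):

```cpp
#include <memory>
#include <stdexcept>

// A resource that may be destroyed at any time.
struct Resource { int value = 42; };

// Stand-in for a capability implementation viewing the resource.
struct ResourceView {
  std::shared_ptr<Resource> resource;
  bool revoked = false;

  // New requests are rejected once the flag is set; a real server method
  // would return a rejected promise instead of throwing synchronously.
  int read() const {
    if (revoked) throw std::runtime_error("resource revoked");
    return resource->value;
  }

  // Called by the owner *before* destroying the resource, so no request
  // can dereference freed state. This is the part that requires keeping a
  // reference to the implementation after handing out *::Client* objects.
  void revoke() {
    revoked = true;
    resource.reset();
  }
};
```

Setting the flag before releasing the resource is what closes the use-after-free window that the membrane's asynchronous revocation leaves open.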
Re: [capnproto] Fine grained control over event queue
Hi again Kenton,

Thank you for your valuable input. Yes, agreed that it's a bit precarious, but it'd be a useful stop-gap that is currently tested and working for us, albeit with a much rougher self-maintained patch.

For more context about our situation: we're constrained by a large existing software stack running on an embedded system that has been extended to use capnp for some IPC but still assumes a single real-time thread. The system configuration is such that the main thread runs on its own dedicated physical core to meet some bandwidth requirements for I/O over Ethernet, and introducing another loop or thread could negatively impact that (locks on shared buffers or message passing). We'd need quite some time to investigate, test and benchmark any changes in that direction.

I will be raising a PR on GitHub later today with the first solution (optionally limiting event loop turns on poll), similar to the previously linked pastebin, and you can see how you feel about it whenever you find time to review.

Thanks,
Rowan Reeve

On Wednesday, June 22, 2022 at 10:01:07 PM UTC+2 ken...@cloudflare.com wrote:

> Hi Rowan,
>
> Hmm, this feels to me like more of a use case for threads or signal handlers. Trying to limit processing time by limiting turn count seems precarious to me, as the amount of time spent on any turn could vary wildly.
>
> That said, the change you propose seems not too bad and I'd accept it; I just worry it's the tip of the iceberg here. Like, will you find out that some particular callbacks in the RPC system can be especially expensive, and need to be counted as multiple turns? I think that would be going too far.
>
> If by any chance your synchronous processing could be made safe to run in a signal handler (requires using lockless data structures), I think a timer signal could be a pretty neat solution here.
> -Kenton
>
> On Wed, Jun 22, 2022 at 2:17 PM Rowan Reeve wrote:
>
>> Hi Kenton,
>>
>> Our company is attempting to use Cap'n Proto in a soft-realtime system where we need tight control over the resolution of promises. Our problem is that we have some existing synchronous processing which needs to run frequently and timely (on the order of 100 microseconds), but our system must also service incoming RPCs (which might access results of said processing) without missing a single processing window.
>>
>> What we would like to be able to do roughly corresponds to the below pseudocode inside our server:
>>
>> loop forever:
>>     if before synchronous processing window:
>>         poll single event
>>     else:
>>         do synchronous processing
>>
>> *WaitScope::poll* almost fulfills our needs, but it causes all new events since the last poll to be picked up and their corresponding promises to be resolved, in this case for incoming requests, which can occasionally cause us to miss our window. Our individual RPC functions are relatively short-lived, but when receiving many in a short span of time, we encounter this timing issue.
>>
>> I think what we're looking for is an overload to *WaitScope::poll* limiting the turn count: https://pastebin.com/T4eFM60F
>>
>> Your thoughts? Otherwise, could this be something that might be accepted as a small pull request?
>>
>> Useful details:
>>
>> - C++ 17 with the official capnp/kj library
>> - Linux 5.x kernel (so UnixEventPort as the async provider)
>>
>> Rowan Reeve
[capnproto] Fine grained control over event queue
Hi Kenton,

Our company is attempting to use Cap'n Proto in a soft-realtime system where we need tight control over the resolution of promises. Our problem is that we have some existing synchronous processing which needs to run frequently and timely (on the order of 100 microseconds), but our system must also service incoming RPCs (which might access results of said processing) without missing a single processing window.

What we would like to be able to do roughly corresponds to the below pseudocode inside our server:

loop forever:
    if before synchronous processing window:
        poll single event
    else:
        do synchronous processing

*WaitScope::poll* almost fulfills our needs, but it causes all new events since the last poll to be picked up and their corresponding promises to be resolved, in this case for incoming requests, which can occasionally cause us to miss our window. Our individual RPC functions are relatively short-lived, but when receiving many in a short span of time, we encounter this timing issue.

I think what we're looking for is an overload to *WaitScope::poll* limiting the turn count: https://pastebin.com/T4eFM60F

Your thoughts? Otherwise, could this be something that might be accepted as a small pull request?

Useful details:

- C++ 17 with the official capnp/kj library
- Linux 5.x kernel (so UnixEventPort as the async provider)

Rowan Reeve
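The idea behind the turn-limited poll can be emulated with a plain queue (*ToyEventQueue* is illustrative; the actual proposal adds an overload to *kj::WaitScope::poll*, per the linked pastebin):

```cpp
#include <functional>
#include <queue>

// Toy model of a turn-limited poll: drain at most maxTurns queued events
// per call, so the caller can interleave event handling with a
// time-critical synchronous processing window.
struct ToyEventQueue {
  std::queue<std::function<void()>> events;

  // Returns the number of events actually run (<= maxTurns), so the
  // caller can tell whether work remains before the next window.
  unsigned poll(unsigned maxTurns) {
    unsigned ran = 0;
    while (ran < maxTurns && !events.empty()) {
      auto f = std::move(events.front());
      events.pop();
      f();
      ++ran;
    }
    return ran;
  }
};
```

With `poll(1)` in the "before window" branch of the pseudocode above, at most one event runs per iteration, bounding the worst-case latency before the synchronous processing gets the core back.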