Hi Kenton,

I've been digging around the last couple days and the approach of adding 
*MembranePolicy::getCanceler() 
-> kj::Maybe<kj::Canceler&>* seems to be OK when it comes to getting rid of 
outstanding requests from within the *MembraneRequestHook *and 
*MembraneHook*, but I've run into some snags emulating the behaviour of 
*MembranePolicy::onRevoked()* but synchronously:

Including just the above there is no way to determine whether a membrane 
has been revoked nor the exception to raise when attempting new requests. I 
have added two new functions to address this (which so happen to allow for 
backward compatibility):

   - *MembranePolicy::isRevoked() -> bool,* which not strictly necessary, 
   though aids comprehension, as next function could be made to return a 
   *kj::Maybe*.
   - *MembranePolicy::getRevocationReason() -> kj::Exception*, which is 
   expected to throw if called when *isRevoked()* would return *false*.

This allows us to reject not just outstanding requests but *future* requests 
(without relying on *onRevoked()*), where the new request promises are 
substituted with promises rejected given the exception provided by 
*MembranePolicy::getRevocationReason()*. This comes with the shortcoming 
that because we are no longer *notified* about revocation inside the hooks, 
we cannot permanently revoke existing capabilities (see *MembraneHook:: 
MembraneHook* where we replace the wrapped capability with *newBrokenCap()*). 
This causes a knock-on effect where if the policy is unrevoked, then all 
the membraned capabilities which were previously rejecting new requests 
will suddenly become active again, which I don't think is desirable 
behaviour.

The only thing I can think of to make this work like 
*MembranePolicy::onRevoked()* would be to register a callback, but then 
we'd need to return some sort of RAII subscription so that our hooks don't 
get called after being dropped by a client, for which I don't think KJ has 
any infrastructure. So *MembranePolicy* ends up with something like the 
following new virtual functions:

*MembranePolicy::getCanceler() -> kj::Maybe<kj::Canceler&>* // Wraps 
requests made in hooks
*MembranePolicy::onRevoked(kj::Function<void(kj::Exception)>&& callback) -> 
kj::Subscription*

The following diff shows what I've got so far (as per first suggested 
implementation): 
https://github.com/capnproto/capnproto/compare/master...Zentren:capnproto:sync-membrane-policy

Any thoughts?


Thanks,

Rowan Reeve

On Wednesday, April 5, 2023 at 8:28:18 PM UTC+2 ken...@cloudflare.com wrote:

Hi Rowan,

Right, MembranePolicy today uses exclusiveJoin() to effect an asynchronous 
revocation. The problem with such asynchronous revocation is that it 
provides no explicit guarantee of when it's safe to assume revocation has 
occurred. To use this safely, you probably need to detect when your 
objects' destructors are eventually called, when their refcount reaches 
zero, before you can assume they are no longer used.

What I'm saying is we should probably add a new feature to MembranePolicy 
that uses kj::Canceler instead of exclusiveJoin(). kj::Canceler allows 
synchronous revocation. I think this should probably be a feature of 
MembranePolicy; I agree it's annoying to force applications to do it 
manually. Perhaps we could deprecate the old exclusiveJoin() approach, too, 
as the synchronous approach seems strictly better.

Note that doing something like `onRevoked.then([]() { canceler.cancel(); 
})` doesn't really solve anything, since the `.then()` runs asynchronously, 
so once again you have no guarantee when it will take effect.

-Kenton

On Thu, Mar 30, 2023 at 4:25 PM Rowan Reeve <rowan...@gmail.com> wrote:

Hi Kenton,

No stress, your time is given freely and I appreciate it.

Your suggestion makes sense to allow an immediate method of cancelling 
outstanding requests wrapped inside a membrane. After a look over 
*membrane.c++*, I do not see a *kj::Canceller* in use, so I presume this is 
done using *kj::Promise::exclusiveJoin.* I think I see three scenarios 
being dealt with when *kj::MembranePolicy::onRevoked* resolves:

   1. Existing requests are eventually rejected, but the underlying call 
   path might still run depending on exclusive join resolution order (i.e. it 
   will run if made before *onRevoked *was resolved). [1] 
   
<https://github.com/capnproto/capnproto/blob/v0.10.3/c%2B%2B/src/capnp/membrane.c%2B%2B#L213>
   [2] 
   
<https://github.com/capnproto/capnproto/blob/v0.10.3/c%2B%2B/src/capnp/membrane.c%2B%2B#L225>
   2. New requests against a capability obtained before revocation are 
   rejected. [1] 
   
<https://github.com/capnproto/capnproto/blob/v0.10.3/c%2B%2B/src/capnp/membrane.c%2B%2B#L467>
   3. New requests against a capability obtained after revocation (replaced 
   with a dummy) are rejected.[1] 
   
<https://github.com/capnproto/capnproto/blob/v0.10.3/c%2B%2B/src/capnp/membrane.c%2B%2B#L331>
   
I think requests from (1) can be immediately cancelled given access to a 
*kj::Canceller* wrapping all membraned requests. I think given the dummy 
capability injected in (3), those requests are safely rejected as-is. I 
however have a concern with (2); is it guaranteed that these new requests 
will be resolved *after* *onRevoked* is processed? I'd presume requests 
would land up on the event queue in-order, but I just wonder if there could 
be any race conditions involved. *If* it is all processed in-order, is it 
then also safe to assume that *kj* will eagerly evaluate a 
onRevoked*.then([this]() 
{ canceller.cancel(); }* relying on the result of *onRevoked, *i.e. that *this 
*is still safe to use in the *MembraneHook* and/or *MembraneRequestHook*?

It's a pity that the user would need to be responsible for both manually 
cancelling outstanding requests in addition to rejecting the promises 
exposed by *kj::MembranePolicy::**onRevoked* (unless I'm missing 
something). I wonder, it seems like *kj::MembranePolicy::onRevoked* seems 
to be intended to produce promises from a *kj::ForkedPromise* under the 
hood, which itself seems to have been done as a convenience as this 
provides a single-producer/multi-consumer interface to this revocation 
"event", and *kj::Promise::exclusiveJoin* already existed to reject calls. 
Could another single-producer/multi-consumer *protected* interface be 
exposed by *kj::MembranePolicy* which handles all this inline, i.e. without 
going to the event loop but leaving the public interface unchanged?

Given your current and future feedback, could I raise an issue and look 
into creating a draft PR on GitHub to start exploring the change that 
you've suggested? I will probably only get to writing any code from the 
10th of April, so further discussion can occur here and/or on the issue in 
the meantime (whichever is preferred).

Look forward to hearing from you,

Rowan Reeve


On Monday, March 27, 2023 at 4:43:51 PM UTC+2 ken...@cloudflare.com wrote:

Hi Rowan,

Sorry for the slow reply, my inbox is overloaded as always.

Indeed, since the `onRevoked` mechanism is triggered by a promise, the 
actual revocation and cancellation occurs asynchronously. It's possible 
that some other promise will be queued in between the point where you 
resolve the revoker promise and when the revocation actually takes effect.

kj::Canceler has better behavior, in that all cancellation happens 
synchronously. But, capnp::Membrane does not currently use that. I have 
myself hit this a couple times and ended up using hacks like you suggest.

Perhaps we should extend MembranePolicy with `getCanceler()` that returns 
`kj::Maybe<kj::Canceler&>`. If non-null, the canceler wraps all promises 
and capabilities passed through the membrane.

-Kenton

On Mon, Mar 27, 2023 at 7:35 AM Rowan Reeve <rowan...@gmail.com> wrote:

I've added an ugly unit test to a branch on my GitHub to illustrate:

https://github.com/capnproto/capnproto/compare/master...Zentren:capnproto:membrane_issue?expand=1#diff-49ad79a4fffcbe88fcd8681ec67d49f5f6e5fc9010961c1b10ef1b462f0e957eR477

Note line 477 in *c++/src/capnp/membrane-test.c++* where I'd expect the 
request to have been cancelled as per membrane policy *onRevoked()* docs 
("all outstanding calls cancelled"). Looking at the behavior, it seems like 
chained promises in the request are not cancelled as part of this (only the 
initial *call(CallContext)* IF we have not yet entered its body).


Thanks,

Rowan Reeve
On Wednesday, March 15, 2023 at 3:42:39 PM UTC+2 Rowan Reeve wrote:

Hi Kenton,

I am encountering a problem where capabilities acting as views over some 
resources are intermittently causing segfaults. The capability is wrapped 
using *capnp::membrane* given a membrane policy where the promise returned 
by *onRevoked* can be rejected on-demand via a synchronous reject function 
(a kj::PromiseFulfillerPair is used to do this).

The resources may be destroyed together at any time, whereby the membrane 
managing the capabilities accessing the resource states is revoked. 
However, this does not seem to be an instantaneous operation (presumably 
due to revocation being managed by a promise), and I have encountered the 
following issue as a result:

Unresolved requests made before the membrane policy has been revoked and 
where the resource has since been destroyed are not cancelled but will 
rather resolve, accessing invalid memory.

The workaround I have found to address this issue is to add a flag and a 
*kj::Canceller* to the capability implementations whereby new requests are 
rejected if the flag is set, and in addition when the flag is first set, 
the canceler cancels all returned promises in cases where a chained promise 
was returned rather than *kj::READY_NOW*. However, this is very ugly and 
necessitates keeping around references to the capability implementations 
before they are converted to *::Client* objects (so that we can set that 
flag). I'm thinking that surely there has to be a better way I have not 
considered.

Do you have any thoughts on a better solution to this problem? If needed, I 
can try create a minimal reproducible example to illustrate.

In case it matters, OS is Ubuntu 20.04 and capnp version is 8.0.0, both 
currently contained by my production environment.

Thank you for your time,

Rowan Reeve

-- 
You received this message because you are subscribed to the Google Groups 
"Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an 
email to capnproto+...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/capnproto/2d126940-b82e-4ef8-9f41-304d8a23c97cn%40googlegroups.com
 
<https://groups.google.com/d/msgid/capnproto/2d126940-b82e-4ef8-9f41-304d8a23c97cn%40googlegroups.com?utm_medium=email&utm_source=footer>
.

-- 
You received this message because you are subscribed to the Google Groups 
"Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an 
email to capnproto+...@googlegroups.com.

To view this discussion on the web visit 
https://groups.google.com/d/msgid/capnproto/7a4e7362-7f02-48ee-a551-97437a3b62d9n%40googlegroups.com
 
<https://groups.google.com/d/msgid/capnproto/7a4e7362-7f02-48ee-a551-97437a3b62d9n%40googlegroups.com?utm_medium=email&utm_source=footer>
.

-- 
You received this message because you are subscribed to the Google Groups 
"Cap'n Proto" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to capnproto+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/capnproto/f8fbe4e9-754e-4aed-9ea1-a4435cece00dn%40googlegroups.com.

Reply via email to