Re: RFC: localcount_hadref() or localcount_trydarin()
Hi, On 2017/06/12 21:51, Taylor R Campbell wrote: >> Date: Mon, 12 Jun 2017 10:53:52 +0900 >> From: Kengo NAKAHARA>> >> I want to avoid detaching the encryption device while it is used by IPsec. >> That is, once someone creates Security Assocatation(SA) to call >> crypto_newsession(), the encryption device related the SA must not be >> detached until the SA is flushed(done crypto_freesession()) and the SA >> is not used(done crypto_dispatch() and cryptointr()). > > Why don't you just use a global reference count first? Is the latency > and scalability of crypto_newsession and crypto_freesession critical? > > I am not familiar with opencrypto -- maybe the answer is yes. But in > crypto_newsession, crypto_freesession, and cryptointr, you're still > acquiring a global lock. That defeats the purpose of using > localcount, which is to make the latency and scalability of > acquire/release no more than a CPU-local integer increment. Exactly. At first, I tried to remove global lock in a single step using localcount(9) and other tools, however I could not completely fix bugs. So, I changed my policy. I restructure locks and define lock-orders at first, and then I will remove global locks. # This lock restructure also has the purpose to fix packet reordering. # Yes, there are many tasks about opencrypto... That is also the reason why I suspend implementation the patches shown in my previous mail. Sorry to lack of talk again. I will read your next mail, but I think it take time for me to read it and consider the implementation. I reply this answer as soon as possible. Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, IoT Platform Development Department, Network Division, Technology Unit Kengo NAKAHARA
Re: RFC: localcount_hadref() or localcount_trydarin()
On Mon, Jun 12, 2017 at 12:51:29PM +, Taylor R Campbell wrote: > > Date: Mon, 12 Jun 2017 10:53:52 +0900 > > From: Kengo NAKAHARA> > > > I want to avoid detaching the encryption device while it is used by IPsec. > > That is, once someone creates Security Assocatation(SA) to call > > crypto_newsession(), the encryption device related the SA must not be > > detached until the SA is flushed(done crypto_freesession()) and the SA > > is not used(done crypto_dispatch() and cryptointr()). > > Why don't you just use a global reference count first? Is the latency > and scalability of crypto_newsession and crypto_freesession critical? For many workloads, it will be, yes. This pair of operations will occur: * Once per SSL/TLS connection even if the connection is resumed, which is tens of thousands of times per second on a busy server, possibly even hundreds of thousands of times per second. This assumes someone has an SSL/TLS library that can efficiently use our kernel crypto, but there's at least one out there that I know of. With modern instruction-based accelerators rather than the DMA-and-interrupts style this probably matters less. * Once per Phase 2 IPsec association -- potentially tens of thousands per second in recovery from an outage -- this likely matters more to most users of our opencrypto today. -- Thor Lancelot Simont...@panix.com "We cannot usually in social life pursue a single value or a single moral aim, untroubled by the need to compromise with others." - H.L.A. Hart
Re: RFC: localcount_hadref() or localcount_trydarin()
> Date: Mon, 12 Jun 2017 10:53:52 +0900 > From: Kengo NAKAHARA> > I want to avoid detaching the encryption device while it is used by IPsec. > That is, once someone creates Security Assocatation(SA) to call > crypto_newsession(), the encryption device related the SA must not be > detached until the SA is flushed(done crypto_freesession()) and the SA > is not used(done crypto_dispatch() and cryptointr()). Why don't you just use a global reference count first? Is the latency and scalability of crypto_newsession and crypto_freesession critical? I am not familiar with opencrypto -- maybe the answer is yes. But in crypto_newsession, crypto_freesession, and cryptointr, you're still acquiring a global lock. That defeats the purpose of using localcount, which is to make the latency and scalability of acquire/release no more than a CPU-local integer increment.
Re: RFC: localcount_hadref() or localcount_trydarin()
Hi, On 2017/06/10 1:20, Taylor R Campbell wrote: >> Date: Fri, 9 Jun 2017 16:50:18 +0900 >> From: Kengo NAKAHARA>> >> I tried using localcount(9) for opencyrpto(9) to avoid detach encryption >> drivers. While I implement that, I want to something to use KASSERT which >> ensure someone have reference or not. >> In addition, localcount_drain() waits forever if someone has had reference. >> So, I want to function which returns in not so long time regardless of >> someone has reference or not. In other words, I want some function like >> mutex_tryenter() for mutex_enter(). > > This sounds questionable to me. If there is a reference remaining > that does not drain in a bounded time, that sounds like a bug that > should be addressed some other way. Can you explain how you want to > use this? Can you share preliminary diffs to opencrypto code with > localcount that motivate this? At first, I mention conclusion shortly, - I want to avoid detaching encryption device when it is used by IPsec - I implement naive design of that with localcount(9) - The patches show in the last - After that, the detaching process(drvctl(8)) waits infinity The details are below. I want to avoid detaching the encryption device while it is used by IPsec. That is, once someone creates Security Assocatation(SA) to call crypto_newsession(), the encryption device related the SA must not be detached until the SA is flushed(done crypto_freesession()) and the SA is not used(done crypto_dispatch() and cryptointr()). Thus, I can eliminate "migration job" in opencrypto. The migration job runs when encryption device is detached while encryption job is processing, that is, below two codes. [1] https://nxr.netbsd.org/xref/src/sys/opencrypto/crypto.c#1620 [2] https://nxr.netbsd.org/xref/src/sys/opencrypto/crypto.c#1300 The [2] codes(crypto_invoke()) is fast path, so I want to avoid calling crypto_newsession() which may sleep. Furthermore, the migration job can disturb crp_q order. I implement prototype which design is below. - acquire localcount(9) in crypto_newsession() - release localcount(9) in crypto_freesession() - acquire and release localcount(9) in crypto_dispatch() and cryptointr() - drain localcount(9) in crypto_unregister() Hmm, that force drvctl(8) which detach the encryption device wait infinity until the SA is flushed. And the drvctl(8) cannot be killed. It is not good. I think localcount_trydrain() is required to avoid the drvctl(8) infinity waiting simply. Here is the working patch series. I'm sorry to say the patches is old, so they are applied to crypto.c:r1.78 and cryptodev.h:r1.34. Furthermore, there is still some bugs - http://www.netbsd.org/~knakahara/opencrypto-localcount/0001-introduce-localcount.patch - http://www.netbsd.org/~knakahara/opencrypto-localcount/0002-remove-migrate-process.patch - http://www.netbsd.org/~knakahara/opencrypto-localcount/0003-avoid-localcount-acquire-after-drain.patch Could you comment this design or patches? Hmm, I begin to feel this usage is not fit to localcount(9) Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, IoT Platform Development Department, Network Division, Technology Unit Kengo NAKAHARA
Re: RFC: localcount_hadref() or localcount_trydarin()
> Date: Fri, 9 Jun 2017 16:50:18 +0900 > From: Kengo NAKAHARA> > I tried using localcount(9) for opencyrpto(9) to avoid detach encryption > drivers. While I implement that, I want to something to use KASSERT which > ensure someone have reference or not. > In addition, localcount_drain() waits forever if someone has had reference. > So, I want to function which returns in not so long time regardless of > someone has reference or not. In other words, I want some function like > mutex_tryenter() for mutex_enter(). This sounds questionable to me. If there is a reference remaining that does not drain in a bounded time, that sounds like a bug that should be addressed some other way. Can you explain how you want to use this? Can you share preliminary diffs to opencrypto code with localcount that motivate this?
Re: RFC: localcount_hadref() or localcount_trydarin()
Hi, On 2017/06/09 17:18, Paul Goyette wrote: > On Fri, 9 Jun 2017, Kengo NAKAHARA wrote: >> I tried using localcount(9) for opencyrpto(9) to avoid detach encryption >> drivers. While I implement that, I want to something to use KASSERT which >> ensure someone have reference or not. >> In addition, localcount_drain() waits forever if someone has had reference. >> So, I want to function which returns in not so long time regardless of >> someone has reference or not. In other words, I want some function like >> mutex_tryenter() for mutex_enter(). // snip > I think you are trying to abuse the design of localcount(9) :) > > The problem here is, whether or not you return immediately, the > localcount has been marked for draining, and nothing else will ever > be able to acquire it. You've already set the totalp pointer to a > non-NULL value, so any further attempts to call localcount_drain() > or localcount_hadref() will fail in the KASSERT. > > So, let's assume that you call localcount_hadref(); if it returns > true you immediately call localcount_fini() (since it has already > been drained). But what happens if it returns false? You can > never go back and check the "total" again, so you will never know > when it is safe to call localcount_fini(). > > >> In the other design, rename the function name to "localcount_trydrain()" >> and invert return value true/false. > > This has the same problem. You WILL always drain the localcount, > whether or not you return quickly. But if the localcount is not > already drained, you will never be able to check again. > > Furthermore, the totalp pointer points to a value _on_the_stack_ > and the pointer is stored in the localcount itself. So, if you > allow the routine to return "early", there might still be other > places that hold references to this localcount, and they will > eventually call localcount_release(). When that happens, they > will try to decrement the total value, and this may corrupt the > stack of some other routine! > > > As I wrote in the man-page, draining the localcount is a one-shot > operation - once you start the process of draining (by setting the > totalp and issuing the xcall) you cannot stop the process, nor can > you restart the process (to get an updated count). Additionally, > once you start the process of draining, you _must_ wait for it to > complete; Thank you for your detailed comments! I read man again carefully..., yes, to the last. Hmm... I see. I rethink opencrypto(9) design to use not only localcount(9) but also any other tools to avoid waiting. Anyway, thank you for your expositions. > you cannot escape. > > :) Oh, my resistance is futile Thanks, -- // Internet Initiative Japan Inc. Device Engineering Section, IoT Platform Development Department, Network Division, Technology Unit Kengo NAKAHARA
Re: RFC: localcount_hadref() or localcount_trydarin()
On Fri, 9 Jun 2017, Kengo NAKAHARA wrote: Hi, I tried using localcount(9) for opencyrpto(9) to avoid detach encryption drivers. While I implement that, I want to something to use KASSERT which ensure someone have reference or not. In addition, localcount_drain() waits forever if someone has had reference. So, I want to function which returns in not so long time regardless of someone has reference or not. In other words, I want some function like mutex_tryenter() for mutex_enter(). My naive implementation of that is here. bool localcount_hadref(struct localcount *lc, kmutex_t *interlock) { int64_t total = 0; KASSERT(mutex_owned(interlock)); KASSERT(lc->lc_totalp == NULL); /* Mark it draining. */ lc->lc_totalp = mutex_exit(interlock); xc_wait(xc_broadcast(0, _xc, lc, interlock)); mutex_enter(interlock); /* Paranoia: Cause any further use of lc->lc_totalp to crash. */ lc->lc_totalp = (void *)(uintptr_t)1; KASSERT(total >= NULL); if (total == 0) return false; else return true; } I think you are trying to abuse the design of localcount(9) :) The problem here is, whether or not you return immediately, the localcount has been marked for draining, and nothing else will ever be able to acquire it. You've already set the totalp pointer to a non-NULL value, so any further attempts to call localcount_drain() or localcount_hadref() will fail in the KASSERT. So, let's assume that you call localcount_hadref(); if it returns true you immediately call localcount_fini() (since it has already been drained). But what happens if it returns false? You can never go back and check the "total" again, so you will never know when it is safe to call localcount_fini(). In the other design, rename the function name to "localcount_trydrain()" and invert return value true/false. This has the same problem. You WILL always drain the localcount, whether or not you return quickly. But if the localcount is not already drained, you will never be able to check again. Furthermore, the totalp pointer points to a value _on_the_stack_ and the pointer is stored in the localcount itself. So, if you allow the routine to return "early", there might still be other places that hold references to this localcount, and they will eventually call localcount_release(). When that happens, they will try to decrement the total value, and this may corrupt the stack of some other routine! As I wrote in the man-page, draining the localcount is a one-shot operation - once you start the process of draining (by setting the totalp and issuing the xcall) you cannot stop the process, nor can you restart the process (to get an updated count). Additionally, once you start the process of draining, you _must_ wait for it to complete; you cannot escape. :) +--+--++ | Paul Goyette | PGP Key fingerprint: | E-mail addresses: | | (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com | | Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org | +--+--++