Re: RFC: localcount_hadref() or localcount_trydarin()

2017-06-12 Thread Kengo NAKAHARA
Hi,

On 2017/06/12 21:51, Taylor R Campbell wrote:
>> Date: Mon, 12 Jun 2017 10:53:52 +0900
>> From: Kengo NAKAHARA 
>>
>> I want to avoid detaching the encryption device while it is used by IPsec.
>> That is, once someone creates Security Assocatation(SA) to call
>> crypto_newsession(), the encryption device related the SA must not be
>> detached until the SA is flushed(done crypto_freesession()) and the SA
>> is not used(done crypto_dispatch() and cryptointr()).
> 
> Why don't you just use a global reference count first?  Is the latency
> and scalability of crypto_newsession and crypto_freesession critical?
> 
> I am not familiar with opencrypto -- maybe the answer is yes.  But in
> crypto_newsession, crypto_freesession, and cryptointr, you're still
> acquiring a global lock.  That defeats the purpose of using
> localcount, which is to make the latency and scalability of
> acquire/release no more than a CPU-local integer increment.

Exactly. At first, I tried to remove global lock in a single step using
localcount(9) and other tools, however I could not completely fix bugs.
So, I changed my policy. I restructure locks and define lock-orders at
first, and then I will remove global locks.
# This lock restructure also has the purpose to fix packet reordering.
# Yes, there are many tasks about opencrypto...

That is also the reason why I suspend implementation the patches shown
in my previous mail. Sorry to lack of talk again.

I will read your next mail, but I think it take time for me to read it
and consider the implementation. I reply this answer as soon as possible.


Thanks,

-- 
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA 


Re: RFC: localcount_hadref() or localcount_trydarin()

2017-06-12 Thread Thor Lancelot Simon
On Mon, Jun 12, 2017 at 12:51:29PM +, Taylor R Campbell wrote:
> > Date: Mon, 12 Jun 2017 10:53:52 +0900
> > From: Kengo NAKAHARA 
> > 
> > I want to avoid detaching the encryption device while it is used by IPsec.
> > That is, once someone creates Security Assocatation(SA) to call
> > crypto_newsession(), the encryption device related the SA must not be
> > detached until the SA is flushed(done crypto_freesession()) and the SA
> > is not used(done crypto_dispatch() and cryptointr()).
> 
> Why don't you just use a global reference count first?  Is the latency
> and scalability of crypto_newsession and crypto_freesession critical?

For many workloads, it will be, yes.  This pair of operations will occur:

* Once per SSL/TLS connection even if the connection is resumed,
  which is tens of thousands of times per second on a busy server,
  possibly even hundreds of thousands of times per second.

  This assumes someone has an SSL/TLS library that can efficiently
  use our kernel crypto, but there's at least one out there that I
  know of.  With modern instruction-based accelerators rather than
  the DMA-and-interrupts style this probably matters less.

* Once per Phase 2 IPsec association -- potentially tens of
  thousands per second in recovery from an outage -- this likely
  matters more to most users of our opencrypto today.

-- 
  Thor Lancelot Simont...@panix.com

  "We cannot usually in social life pursue a single value or a single moral
   aim, untroubled by the need to compromise with others."  - H.L.A. Hart


Re: RFC: localcount_hadref() or localcount_trydarin()

2017-06-12 Thread Taylor R Campbell
> Date: Mon, 12 Jun 2017 10:53:52 +0900
> From: Kengo NAKAHARA 
> 
> I want to avoid detaching the encryption device while it is used by IPsec.
> That is, once someone creates Security Assocatation(SA) to call
> crypto_newsession(), the encryption device related the SA must not be
> detached until the SA is flushed(done crypto_freesession()) and the SA
> is not used(done crypto_dispatch() and cryptointr()).

Why don't you just use a global reference count first?  Is the latency
and scalability of crypto_newsession and crypto_freesession critical?

I am not familiar with opencrypto -- maybe the answer is yes.  But in
crypto_newsession, crypto_freesession, and cryptointr, you're still
acquiring a global lock.  That defeats the purpose of using
localcount, which is to make the latency and scalability of
acquire/release no more than a CPU-local integer increment.


Re: RFC: localcount_hadref() or localcount_trydarin()

2017-06-11 Thread Kengo NAKAHARA
Hi,

On 2017/06/10 1:20, Taylor R Campbell wrote:
>> Date: Fri, 9 Jun 2017 16:50:18 +0900
>> From: Kengo NAKAHARA 
>>
>> I tried using localcount(9) for opencyrpto(9) to avoid detach encryption
>> drivers. While I implement that, I want to something to use KASSERT which
>> ensure someone have reference or not.
>> In addition, localcount_drain() waits forever if someone has had reference.
>> So, I want to function which returns in not so long time regardless of
>> someone has reference or not. In other words, I want some function like
>> mutex_tryenter() for mutex_enter().
> 
> This sounds questionable to me.  If there is a reference remaining
> that does not drain in a bounded time, that sounds like a bug that
> should be addressed some other way.  Can you explain how you want to
> use this?  Can you share preliminary diffs to opencrypto code with
> localcount that motivate this?

At first, I mention conclusion shortly,
- I want to avoid detaching encryption device when it is used by IPsec
- I implement naive design of that with localcount(9)
  - The patches show in the last
- After that, the detaching process(drvctl(8)) waits infinity

The details are below.

I want to avoid detaching the encryption device while it is used by IPsec.
That is, once someone creates Security Assocatation(SA) to call
crypto_newsession(), the encryption device related the SA must not be
detached until the SA is flushed(done crypto_freesession()) and the SA
is not used(done crypto_dispatch() and cryptointr()).

Thus, I can eliminate "migration job" in opencrypto. The migration job
runs when encryption device is detached while encryption job is
processing, that is, below two codes.
   [1] https://nxr.netbsd.org/xref/src/sys/opencrypto/crypto.c#1620
   [2] https://nxr.netbsd.org/xref/src/sys/opencrypto/crypto.c#1300
The [2] codes(crypto_invoke()) is fast path, so I want to avoid calling
crypto_newsession() which may sleep. Furthermore, the migration job can
disturb crp_q order.

I implement prototype which design is below.
- acquire localcount(9) in crypto_newsession()
- release localcount(9) in crypto_freesession()
- acquire and release localcount(9) in crypto_dispatch() and cryptointr()
- drain localcount(9) in crypto_unregister()
Hmm, that force drvctl(8) which detach the encryption device wait infinity
until the SA is flushed. And the drvctl(8) cannot be killed. It is not good.
I think localcount_trydrain() is required to avoid the drvctl(8) infinity
waiting simply.

Here is the working patch series. I'm sorry to say the patches is old,
so they are applied to crypto.c:r1.78 and cryptodev.h:r1.34. Furthermore,
there is still some bugs
- 
http://www.netbsd.org/~knakahara/opencrypto-localcount/0001-introduce-localcount.patch
- 
http://www.netbsd.org/~knakahara/opencrypto-localcount/0002-remove-migrate-process.patch
- 
http://www.netbsd.org/~knakahara/opencrypto-localcount/0003-avoid-localcount-acquire-after-drain.patch

Could you comment this design or patches? 
Hmm, I begin to feel this usage is not fit to localcount(9)


Thanks,

-- 
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA 



Re: RFC: localcount_hadref() or localcount_trydarin()

2017-06-09 Thread Taylor R Campbell
> Date: Fri, 9 Jun 2017 16:50:18 +0900
> From: Kengo NAKAHARA 
> 
> I tried using localcount(9) for opencyrpto(9) to avoid detach encryption
> drivers. While I implement that, I want to something to use KASSERT which
> ensure someone have reference or not.
> In addition, localcount_drain() waits forever if someone has had reference.
> So, I want to function which returns in not so long time regardless of
> someone has reference or not. In other words, I want some function like
> mutex_tryenter() for mutex_enter().

This sounds questionable to me.  If there is a reference remaining
that does not drain in a bounded time, that sounds like a bug that
should be addressed some other way.  Can you explain how you want to
use this?  Can you share preliminary diffs to opencrypto code with
localcount that motivate this?


Re: RFC: localcount_hadref() or localcount_trydarin()

2017-06-09 Thread Kengo NAKAHARA
Hi,

On 2017/06/09 17:18, Paul Goyette wrote:
> On Fri, 9 Jun 2017, Kengo NAKAHARA wrote:
>> I tried using localcount(9) for opencyrpto(9) to avoid detach encryption
>> drivers. While I implement that, I want to something to use KASSERT which
>> ensure someone have reference or not.
>> In addition, localcount_drain() waits forever if someone has had reference.
>> So, I want to function which returns in not so long time regardless of
>> someone has reference or not. In other words, I want some function like
>> mutex_tryenter() for mutex_enter().
// snip
> I think you are trying to abuse the design of localcount(9)   :)
> 
> The problem here is, whether or not you return immediately, the
> localcount has been marked for draining, and nothing else will ever
> be able to acquire it.  You've already set the totalp pointer to a
> non-NULL value, so any further attempts to call localcount_drain()
> or localcount_hadref() will fail in the KASSERT.
> 
> So, let's assume that you call localcount_hadref(); if it returns
> true you immediately call localcount_fini() (since it has already
> been drained).  But what happens if it returns false?  You can
> never go back and check the "total" again, so you will never know
> when it is safe to call localcount_fini().
> 
> 
>> In the other design, rename the function name to "localcount_trydrain()"
>> and invert return value true/false.
> 
> This has the same problem.  You WILL always drain the localcount, 
> whether or not you return quickly.  But if the localcount is not
> already drained, you will never be able to check again.
> 
> Furthermore, the totalp pointer points to a value _on_the_stack_
> and the pointer is stored in the localcount itself.  So, if you
> allow the routine to return "early", there might still be other
> places that hold references to this localcount, and they will
> eventually call localcount_release().  When that happens, they
> will try to decrement the total value, and this may corrupt the
> stack of some other routine!
> 
> 
> As I wrote in the man-page, draining the localcount is a one-shot
> operation - once you start the process of draining (by setting the
> totalp and issuing the xcall) you cannot stop the process, nor can
> you restart the process (to get an updated count).  Additionally,
> once you start the process of draining, you _must_ wait for it to
> complete;

Thank you for your detailed comments!

I read man again carefully..., yes, to the last. Hmm... I see.

I rethink opencrypto(9) design to use not only localcount(9)
but also any other tools to avoid waiting.
Anyway, thank you for your expositions.

> you cannot escape.
> 
> :)

Oh, my resistance is futile


Thanks,

-- 
//
Internet Initiative Japan Inc.

Device Engineering Section,
IoT Platform Development Department,
Network Division,
Technology Unit

Kengo NAKAHARA 


Re: RFC: localcount_hadref() or localcount_trydarin()

2017-06-09 Thread Paul Goyette

On Fri, 9 Jun 2017, Kengo NAKAHARA wrote:


Hi,

I tried using localcount(9) for opencyrpto(9) to avoid detach encryption
drivers. While I implement that, I want to something to use KASSERT which
ensure someone have reference or not.
In addition, localcount_drain() waits forever if someone has had reference.
So, I want to function which returns in not so long time regardless of
someone has reference or not. In other words, I want some function like
mutex_tryenter() for mutex_enter().

My naive implementation of that is here.

bool
localcount_hadref(struct localcount *lc, kmutex_t *interlock)
{
   int64_t total = 0;

   KASSERT(mutex_owned(interlock));
   KASSERT(lc->lc_totalp == NULL);

/* Mark it draining.  */
   lc->lc_totalp = 

   mutex_exit(interlock);
   xc_wait(xc_broadcast(0, _xc, lc, interlock));
   mutex_enter(interlock);

   /* Paranoia: Cause any further use of lc->lc_totalp to crash.  */
   lc->lc_totalp = (void *)(uintptr_t)1;

   KASSERT(total >= NULL);
   if (total == 0)
   return false;
   else
   return true;
}



I think you are trying to abuse the design of localcount(9)   :)

The problem here is, whether or not you return immediately, the
localcount has been marked for draining, and nothing else will ever
be able to acquire it.  You've already set the totalp pointer to a
non-NULL value, so any further attempts to call localcount_drain()
or localcount_hadref() will fail in the KASSERT.

So, let's assume that you call localcount_hadref(); if it returns
true you immediately call localcount_fini() (since it has already
been drained).  But what happens if it returns false?  You can
never go back and check the "total" again, so you will never know
when it is safe to call localcount_fini().



In the other design, rename the function name to "localcount_trydrain()"
and invert return value true/false.


This has the same problem.  You WILL always drain the localcount, 
whether or not you return quickly.  But if the localcount is not

already drained, you will never be able to check again.

Furthermore, the totalp pointer points to a value _on_the_stack_
and the pointer is stored in the localcount itself.  So, if you
allow the routine to return "early", there might still be other
places that hold references to this localcount, and they will
eventually call localcount_release().  When that happens, they
will try to decrement the total value, and this may corrupt the
stack of some other routine!


As I wrote in the man-page, draining the localcount is a one-shot
operation - once you start the process of draining (by setting the
totalp and issuing the xcall) you cannot stop the process, nor can
you restart the process (to get an updated count).  Additionally,
once you start the process of draining, you _must_ wait for it to
complete;  you cannot escape.

:)



+--+--++
| Paul Goyette | PGP Key fingerprint: | E-mail addresses:  |
| (Retired)| FA29 0E3B 35AF E8AE 6651 | paul at whooppee dot com   |
| Kernel Developer | 0786 F758 55DE 53BA 7731 | pgoyette at netbsd dot org |
+--+--++