Re: mod_ssl update...

2001-10-04 Thread Greg Stein

On Wed, Oct 03, 2001 at 08:28:06AM -0700, Justin Erenkrantz wrote:
...
 Most of the other filters have their logic separated by input and
 output.  mod_ssl combines them into one function.  I think that
 it makes sense to split it out so that input is only handled by
 one function and output is handled by another function.  I'm not
 sure why we have one function attempting to handle input/output.
 Is there a reason why churn() must exist?  Input and output are
 not related to each other.

For SSL, yes: they *are* related. It is possible that OpenSSL may need to
send some bytes to the client in order to read some bytes. I think this is
tied up with the renegotiation stuff. I don't really know *why* it would
need to write to the client during a read, but I know it does. That is why
Ben designed it as tied-together input and output filters.

...
  I don't think so..As long as we can gather the *full client data* and pass
  it across, OpenSSL is happy.. The catch here is to capture all the data
  that's sent by the client, and not to break it into small chunks/pieces.
 
 What the real deal is that OpenSSL was assumming that it'd get the
 socket bucket back from the core filter and it'd be able to do whatever
 it wanted with it.  That's not possible anymore.

Yup. I'm guessing the logic doesn't like running out of input. When it does,
and it can't simply exit saying this is all I have right now (e.g. it is
mid-step during some protocol bits), then it needs to call ap_get_brigade()
again.

 If it is possible to have the core do the heavy lifting (i.e. reading
 from the socket), that is preferred.  I'm just not sure how mod_ssl
 is reading from the socket at all.  It calls ap_get_brigade *and*
 SSL_read - that is a violation of the input filter semantics.  Either
 it reads from the socket on its own or it relies on the core for all
 of the input.

I don't think we ever give it the socket. From looking at the code, it
appears we do an apr_bucket_read() to get some data, write that into some
kind of buffer with BIO_write(), then SSL_read() will return decrypted data
based on what it finds in the BIO.

  It's definitely possible to use the SSL_* functions - but, then we'll have
  to expose the socketfd's et al.. Also, it'd be deviating from the other
  modules of apache, where the filters are *almost* forced to be used. I'd
  prefer to have the ap_get_brigade_* functionality to read/write the data
  from the socket.

Euh... no. I believe the basic concept is sound. We simply need to fix the
interaction between the brigade fetched from f-next and the BIO/SSL
functions.

 Yes, I'd prefer it to rely on core, but there are things like *readbytes
 0 that are completely bogus in the core with SSL.  That's why I'm not 
 sure we can use the core filter anymore.

mod_ssl should never pass readbytes==0 to f-next. As you point out, that is
bogus.

So the next question is why does it do that? Simple answer is that the
code is broken. If a caller asks for a line (readbytes==0), then it blindly
passes that to the next filter. Instead, it should ask f-next for BUFSIZE
bytes or somesuch, decode them, then parse a line out and return that.

How did it work before? Nobody ever asked it for readbytes==0 through
various happenstance circumstances. The API for input filters said that
somebody *could* do that, but it just didn't happen.

Can you say, fragile ? Thought so.

Cheers,
-g

-- 
Greg Stein, http://www.lyra.org/



RE: mod_ssl update...

2001-10-03 Thread MATHIHALLI,MADHUSUDAN (HP-Cupertino,ex1)

Hi,
  Pl. find my comments below :

-Original Message-
From: Justin Erenkrantz

[..snip..]
/* XXX THIS STUFF NEEDS A MAJOR CLEANUP -RSE XXX */
So, I'm obviously not the first one to think this and that
was before the input filters change forced this issue.  =)
[..snip..]

True.. it's known that the ssl filtering is to be stabilized. That's the
reason the comment still exists in the file - else it would have gone long
time ago :-).

[..snip..]
Anyway, I see that the input and output filters are handled
by one function - churn.  Is that dictated by the mechanics
of OpenSSL?  Can we separate input and output entirely or
do we need to have them coexist like they are now?
[..snip..]

You're right.. the bulk of the SSL communication logic is done in churn()..
The logic basically reads the user data from the filter, gives it to OpenSSL
thru' the BIO routines, and whatever is output by openssl is picked up thru'
the BIO interfaces and put on to the output queue.. I'm not clear what you
mean by separate input and output..

[..snip..]
I also don't think OpenSSL will like the idea of renegotiating via buckets.
=)  
[..snip..]

I don't think so..As long as we can gather the *full client data* and pass
it across, OpenSSL is happy.. The catch here is to capture all the data
that's sent by the client, and not to break it into small chunks/pieces.

[..snip..]
So, I think we have to teach mod_ssl's input filter to standalone 
without the help of the core.  That means (I think) that we could 
use the SSL_* (i.e. SSL_read) functions when reading from the 
socket rather than ap_get_brigade/BIO_*.  Can we intermix calls to 
BIO_* and SSL_*?  Are they separate?  When do we want to use BIO_* 
and not SSL_*?
[..snip..]

It's definitely possible to use the SSL_* functions - but, then we'll have
to expose the socketfd's et al.. Also, it'd be deviating from the other
modules of apache, where the filters are *almost* forced to be used. I'd
prefer to have the ap_get_brigade_* functionality to read/write the data
from the socket.

[..snip..]
There is just a lot of stuff here.  And, I think Ralf nailed it
on the head.  =) 
[..snip..]

shameless plugI picked up the SSL filtering logic from tls, and modified
it to work for mod-ssl /shameless plug.. OtherBill, Doug, Cliff added some
real valuable stuff to the filter code.

-Madhu



Re: mod_ssl update...

2001-10-03 Thread Justin Erenkrantz

On Wed, Oct 03, 2001 at 01:30:02AM -0700, MATHIHALLI,MADHUSUDAN (HP-Cupertino,ex1) 
wrote:
 You're right.. the bulk of the SSL communication logic is done in churn()..
 The logic basically reads the user data from the filter, gives it to OpenSSL
 thru' the BIO routines, and whatever is output by openssl is picked up thru'
 the BIO interfaces and put on to the output queue.. I'm not clear what you
 mean by separate input and output..

Most of the other filters have their logic separated by input and
output.  mod_ssl combines them into one function.  I think that
it makes sense to split it out so that input is only handled by
one function and output is handled by another function.  I'm not
sure why we have one function attempting to handle input/output.
Is there a reason why churn() must exist?  Input and output are
not related to each other.

I'm not familiar with the BIO_* routines (when I was writing the
SSL code for flood, the BIO_* routines did not work for me).  So,
I had it handle reading from the socket itself - and that seemed
to work out well.

 I don't think so..As long as we can gather the *full client data* and pass
 it across, OpenSSL is happy.. The catch here is to capture all the data
 that's sent by the client, and not to break it into small chunks/pieces.

What the real deal is that OpenSSL was assumming that it'd get the
socket bucket back from the core filter and it'd be able to do whatever
it wanted with it.  That's not possible anymore.

If it is possible to have the core do the heavy lifting (i.e. reading
from the socket), that is preferred.  I'm just not sure how mod_ssl
is reading from the socket at all.  It calls ap_get_brigade *and*
SSL_read - that is a violation of the input filter semantics.  Either
it reads from the socket on its own or it relies on the core for all
of the input.

 It's definitely possible to use the SSL_* functions - but, then we'll have
 to expose the socketfd's et al.. Also, it'd be deviating from the other
 modules of apache, where the filters are *almost* forced to be used. I'd
 prefer to have the ap_get_brigade_* functionality to read/write the data
 from the socket.

Yes, I'd prefer it to rely on core, but there are things like *readbytes
0 that are completely bogus in the core with SSL.  That's why I'm not 
sure we can use the core filter anymore.  

Since all of the data in the socket is encrypted, if readbytes == 0 is 
sent, it means to read a line of input data which is now bogus because 
the core (since it can't read the actual socket information) can't
determine when a line of input data is read.  Only mod_ssl has the
logic to decrypt the packet information.  

(And look at flood to see how you get the socketfd back from the
apr_socket_t...)  -- justin