Re: mod_ssl update...
On Wed, Oct 03, 2001 at 08:28:06AM -0700, Justin Erenkrantz wrote: ... Most of the other filters have their logic separated by input and output. mod_ssl combines them into one function. I think that it makes sense to split it out so that input is only handled by one function and output is handled by another function. I'm not sure why we have one function attempting to handle input/output. Is there a reason why churn() must exist? Input and output are not related to each other. For SSL, yes: they *are* related. It is possible that OpenSSL may need to send some bytes to the client in order to read some bytes. I think this is tied up with the renegotiation stuff. I don't really know *why* it would need to write to the client during a read, but I know it does. That is why Ben designed it as tied-together input and output filters. ... I don't think so..As long as we can gather the *full client data* and pass it across, OpenSSL is happy.. The catch here is to capture all the data that's sent by the client, and not to break it into small chunks/pieces. What the real deal is that OpenSSL was assumming that it'd get the socket bucket back from the core filter and it'd be able to do whatever it wanted with it. That's not possible anymore. Yup. I'm guessing the logic doesn't like running out of input. When it does, and it can't simply exit saying this is all I have right now (e.g. it is mid-step during some protocol bits), then it needs to call ap_get_brigade() again. If it is possible to have the core do the heavy lifting (i.e. reading from the socket), that is preferred. I'm just not sure how mod_ssl is reading from the socket at all. It calls ap_get_brigade *and* SSL_read - that is a violation of the input filter semantics. Either it reads from the socket on its own or it relies on the core for all of the input. I don't think we ever give it the socket. From looking at the code, it appears we do an apr_bucket_read() to get some data, write that into some kind of buffer with BIO_write(), then SSL_read() will return decrypted data based on what it finds in the BIO. It's definitely possible to use the SSL_* functions - but, then we'll have to expose the socketfd's et al.. Also, it'd be deviating from the other modules of apache, where the filters are *almost* forced to be used. I'd prefer to have the ap_get_brigade_* functionality to read/write the data from the socket. Euh... no. I believe the basic concept is sound. We simply need to fix the interaction between the brigade fetched from f-next and the BIO/SSL functions. Yes, I'd prefer it to rely on core, but there are things like *readbytes 0 that are completely bogus in the core with SSL. That's why I'm not sure we can use the core filter anymore. mod_ssl should never pass readbytes==0 to f-next. As you point out, that is bogus. So the next question is why does it do that? Simple answer is that the code is broken. If a caller asks for a line (readbytes==0), then it blindly passes that to the next filter. Instead, it should ask f-next for BUFSIZE bytes or somesuch, decode them, then parse a line out and return that. How did it work before? Nobody ever asked it for readbytes==0 through various happenstance circumstances. The API for input filters said that somebody *could* do that, but it just didn't happen. Can you say, fragile ? Thought so. Cheers, -g -- Greg Stein, http://www.lyra.org/
RE: mod_ssl update...
Hi, Pl. find my comments below : -Original Message- From: Justin Erenkrantz [..snip..] /* XXX THIS STUFF NEEDS A MAJOR CLEANUP -RSE XXX */ So, I'm obviously not the first one to think this and that was before the input filters change forced this issue. =) [..snip..] True.. it's known that the ssl filtering is to be stabilized. That's the reason the comment still exists in the file - else it would have gone long time ago :-). [..snip..] Anyway, I see that the input and output filters are handled by one function - churn. Is that dictated by the mechanics of OpenSSL? Can we separate input and output entirely or do we need to have them coexist like they are now? [..snip..] You're right.. the bulk of the SSL communication logic is done in churn().. The logic basically reads the user data from the filter, gives it to OpenSSL thru' the BIO routines, and whatever is output by openssl is picked up thru' the BIO interfaces and put on to the output queue.. I'm not clear what you mean by separate input and output.. [..snip..] I also don't think OpenSSL will like the idea of renegotiating via buckets. =) [..snip..] I don't think so..As long as we can gather the *full client data* and pass it across, OpenSSL is happy.. The catch here is to capture all the data that's sent by the client, and not to break it into small chunks/pieces. [..snip..] So, I think we have to teach mod_ssl's input filter to standalone without the help of the core. That means (I think) that we could use the SSL_* (i.e. SSL_read) functions when reading from the socket rather than ap_get_brigade/BIO_*. Can we intermix calls to BIO_* and SSL_*? Are they separate? When do we want to use BIO_* and not SSL_*? [..snip..] It's definitely possible to use the SSL_* functions - but, then we'll have to expose the socketfd's et al.. Also, it'd be deviating from the other modules of apache, where the filters are *almost* forced to be used. I'd prefer to have the ap_get_brigade_* functionality to read/write the data from the socket. [..snip..] There is just a lot of stuff here. And, I think Ralf nailed it on the head. =) [..snip..] shameless plugI picked up the SSL filtering logic from tls, and modified it to work for mod-ssl /shameless plug.. OtherBill, Doug, Cliff added some real valuable stuff to the filter code. -Madhu
Re: mod_ssl update...
On Wed, Oct 03, 2001 at 01:30:02AM -0700, MATHIHALLI,MADHUSUDAN (HP-Cupertino,ex1) wrote: You're right.. the bulk of the SSL communication logic is done in churn().. The logic basically reads the user data from the filter, gives it to OpenSSL thru' the BIO routines, and whatever is output by openssl is picked up thru' the BIO interfaces and put on to the output queue.. I'm not clear what you mean by separate input and output.. Most of the other filters have their logic separated by input and output. mod_ssl combines them into one function. I think that it makes sense to split it out so that input is only handled by one function and output is handled by another function. I'm not sure why we have one function attempting to handle input/output. Is there a reason why churn() must exist? Input and output are not related to each other. I'm not familiar with the BIO_* routines (when I was writing the SSL code for flood, the BIO_* routines did not work for me). So, I had it handle reading from the socket itself - and that seemed to work out well. I don't think so..As long as we can gather the *full client data* and pass it across, OpenSSL is happy.. The catch here is to capture all the data that's sent by the client, and not to break it into small chunks/pieces. What the real deal is that OpenSSL was assumming that it'd get the socket bucket back from the core filter and it'd be able to do whatever it wanted with it. That's not possible anymore. If it is possible to have the core do the heavy lifting (i.e. reading from the socket), that is preferred. I'm just not sure how mod_ssl is reading from the socket at all. It calls ap_get_brigade *and* SSL_read - that is a violation of the input filter semantics. Either it reads from the socket on its own or it relies on the core for all of the input. It's definitely possible to use the SSL_* functions - but, then we'll have to expose the socketfd's et al.. Also, it'd be deviating from the other modules of apache, where the filters are *almost* forced to be used. I'd prefer to have the ap_get_brigade_* functionality to read/write the data from the socket. Yes, I'd prefer it to rely on core, but there are things like *readbytes 0 that are completely bogus in the core with SSL. That's why I'm not sure we can use the core filter anymore. Since all of the data in the socket is encrypted, if readbytes == 0 is sent, it means to read a line of input data which is now bogus because the core (since it can't read the actual socket information) can't determine when a line of input data is read. Only mod_ssl has the logic to decrypt the packet information. (And look at flood to see how you get the socketfd back from the apr_socket_t...) -- justin