RE: renegotiating problem - connection hanging?
Hello,

> And, I'd like to point out one more time, we know of cases where a blocking read after a select will block. For example, if someone interposes OpenSSL between select/read/write and the OS. Someone *can* do this and people *do* do this.

I'd like to point out one more time that most of your arguments are not related to this discussion. We have two file descriptors and we relay data between them. If you build a system that interposes OpenSSL between select/read/write and the OS - fine; if you build a system with sophisticated timeout handling - fine; but this is unrelated to this discussion.

Your next argument will be: if you do select() and space shuttle is flying ...

Best regards,
--
Marek Marcola [EMAIL PROTECTED]
__
OpenSSL Project http://www.openssl.org
User Support Mailing List openssl-users@openssl.org
Automated List Manager [EMAIL PROTECTED]
RE: renegotiating problem - connection hanging?
> Hello,
>
> > And, I'd like to point out one more time, we know of cases where a blocking read after a select will block. For example, if someone interposes OpenSSL between select/read/write and the OS. Someone *can* do this and people *do* do this.
>
> I'd like to point out one more time that most of your arguments are not related to this discussion.

Should I just take your word for it?

> We have two file descriptors and we relay data between them.

You mean that's what you *think* you're doing. You have no way of knowing what's going on under the hood. You have no way of knowing whether, for example, encryption has been invisibly added, and the fact that you don't know that is not grounds for breaking because of it.

> If you build a system that interposes OpenSSL between select/read/write and the OS - fine; if you build a system with sophisticated timeout handling - fine; but this is unrelated to this discussion.

I'm sorry, how is it unrelated?

> Your next argument will be: if you do select() and space shuttle is flying ...

I don't get your attitude. You have a set of specific guarantees. The guarantees are adequate to make your code guaranteed to work. You insist on writing code that relies on a guarantee you don't have. It breaks, because the thing that wasn't guaranteed didn't happen. I am saying that's not surprising.

Lots of code built on guarantees you don't have breaks. I can cite example after example where people built code based on guarantees they didn't have, not able to think of any possible way their code could break at the time, but lo and behold, later it broke.

Those who stuck to the guarantees they actually had don't have this problem. When things change, the guarantees you actually have are kept.

I can come up with example after example where at the time the broken code was written, *nobody* could imagine any way to break it. Everyone knew it wasn't guaranteed to work, but some people foolishly argued that it was guaranteed to work because nobody could think of a way to break it. The future *proved* them wrong.

Can I say for sure it will happen in this case? No. But I have seen it happen too many times, causing too much pain, to not feel obligated to point it out. The code is broken. It happens to work. Tomorrow it might not happen to work.

DS
Re: renegotiating problem - connection hanging?
David Schwartz wrote:
> > David you are bringing completely unrelated issues into the situation.
>
> No, you are failing to understand my argument.

A kernel does its job of arbitration like this on a shared/duped file descriptor that both processes have as fd=4:

Thread/Process A:

    select(5, [4], NULL, NULL, {10,0})

[we enter the kernel from the application context and proceed to test fd 4 for readability; we do that test atomically with the following pseudo code:]

    mutex_lock(flip->mutex);
    if ((flip->events & FLIP_BIT_READ_EVENT))
        select_readability_fd_four = 1;
    else
        select_readability_fd_four = 0;
    mutex_unlock(flip->mutex);

Kernel Lower IO Layer:

Data arrives on TCP, kernel marks read event:

    mutex_lock(flip->mutex);
    flip->events |= FLIP_BIT_READ_EVENT;
    mutex_unlock(flip->mutex);

In practice the reporting mechanism may not work exactly like this, but if it did it wouldn't alter the characteristics of it. In practice there may not even be any event bitmask in use, since most applications don't use select/poll for I/O. This makes select/poll the minority, so they chose the other approach: do all the work needed to calculate readability/writability inside the select/poll system call.

Many target CPUs won't need mutexes, since it's possible to use machine code instructions to atomically set and reset bit patterns in a memory location (that is naturally aligned). On Intel i386, in GNU syntax, this may be:

    movl $0x0001,%eax; lock orl %eax,0x012300

This would set EAX to 0x0001 and then logically OR it with memory location 0x012300, thus setting bit 0. For better understanding check out atomic_set_mask() from the Linux kernel, include/asm-i386/atomic.h.

So there is a set ordering of events, a well-defined order. The whole point is that the event trigger mechanism and the event test mechanism (which constitute the interaction between select/poll/read/write/etc.):

* do not cause events to be revoked once they are first reported to an application and not yet cleared by a further read() / write().
* do not lose the posting of events during the event setting process (the classic transactional lost-write scenario, or race).

The roles are:

* The Kernel IO Layer: only sets readability or writability events (new data has arrived for the application; the output buffer is below its water mark, guaranteeing some form of write again).
* The select/poll: only looks at events in read-only fashion; it does not have the ability to set or reset events.
* The read() family of functions: are the only things that can clear readability events.
* The write() family of functions: can reset writability events (buffer full is driven by application writes; it is reset by kernel low-level I/O).

It does not matter how many processes or threads you have with access to the same file descriptor. It does not matter that the select() call pre-dates what we now call threading on unix, because that is irrelevant too. A process is the original form of parallel execution, a file descriptor is inherited across fork() so two processes have access to the same file descriptor, and multi-CPU machines have existed in unix for a long time. The kernel was still doing its job of arbitration then as it does now.

I call again for David to show an existing implementation of poll/select which does not conform to the above guarantees.

David is claiming that:

* A readability event can disappear (after it has been first indicated by poll/select and no read() family of functions have been called: recvmsg()/recv() etc.).
* A writability event can disappear (after it has been first indicated by poll/select and no write() family of functions have been called: sendmsg()/send() etc.).

We are also only interested in conditions concerning file descriptors being used for bulk read/write. We don't care a donkey about accept() or any other system calls and the quirks of a particular platform. That is off-topic.
We don't care a donkey about unrelated theoretical situations like "what if I call close(fd)"; they are irrelevant to the original discussion.

The specifications for poll/select don't talk in terms of non-blocking or blocking of other system calls because that does not concern the select system call. What does concern select is the readability and writability of the file descriptor. What is meant by those terms is that the file descriptor can do more work during the next system call related to it. This can also mean partial writes are possible, or an error return would be indicated, or end-of-stream would be indicated. This can be thought about as there is more
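Editor's sketch: the event-flag model described in the message above (the I/O layer only sets the bit, select/poll only tests it, read() is what clears it) can be written in portable C11 with <stdatomic.h> instead of a mutex or the i386 `lock orl` instruction. The struct and function names are hypothetical and simply follow the message's pseudo code. This only illustrates the model as the poster describes it; it is not a claim about how any particular kernel implements select().

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical event bit; the name follows the pseudo code in the message. */
#define FLIP_BIT_READ_EVENT 0x0001u

/* Hypothetical per-descriptor state: a bitmask of pending events. */
struct flip {
    atomic_uint events;
};

/* Lower I/O layer: data arrived, so post the read event atomically. */
void flip_post_read_event(struct flip *f)
{
    atomic_fetch_or(&f->events, FLIP_BIT_READ_EVENT);
}

/* select()/poll() side: a read-only test that never modifies the mask. */
int flip_is_readable(struct flip *f)
{
    return (atomic_load(&f->events) & FLIP_BIT_READ_EVENT) != 0;
}

/* read() side: in this model, the only operation that clears the event. */
void flip_clear_read_event(struct flip *f)
{
    atomic_fetch_and(&f->events, ~FLIP_BIT_READ_EVENT);
}
```

In this model, testing the flag twice without an intervening clear gives the same answer both times, which is the property the message argues for.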
Re: renegotiating problem - connection hanging?
Marek Marcola wrote:
> Your next argument will be: if you do select() and space shuttle is flying ...

I am in complete agreement here.

The crux of the situation is that I'm (we're) saying it's possible to have a working OpenSSL blocking mode that uses a blocking socket and conforms to the host platform's select/poll/read/write event notification characteristics. This is what I'm calling transparency. Any quirks are also transparent; any application for that host would already need to take those quirks into account, so quirks become a non-issue too.

I'm saying the only reason this is a problem inside OpenSSL is that one high-level API call, SSL_read()/SSL_write(), may end up making two low-level socket calls, read()/write(), each of which may block.

In my wisdom of having written IO layers before: in order to keep the event notification characteristics, you treat the first low-level call you make (per high-level invocation) as one that may block. Any further low-level calls you want to make you must do non-blocking, or return -1 WANT_READ/WANT_WRITE.

Since OpenSSL can almost do full non-blocking mode, as every part is restartable, it's 95% of the way to a transparent blocking mode since the hard bit is already done. If this is not the case I would vote to fix that too :).

All you have to do inside OpenSSL is know whether your underlying IO mode is blocking or non-blocking, and set a flag every time you enter a high-level call from the application context.

Then _BEFORE_ you issue any low-level I/O you test to see if your low-level IO is in blocking mode and the flag is reset. If that condition is true, return -1 and WANT_READ/WANT_WRITE, depending on what you were just about to try and do.

Then _AFTER_ every time you issue a low-level I/O (read or write) you reset that flag.

You now get one I/O per high-level call.

This may also expose/fix bugs within OpenSSL where non-blocking mode is not correctly implemented and they never showed up before because 99% of the time the other end of the connection sent a packet with all the data we needed in it, so the successive read() never returned EAGAIN.

So maybe OpenSSL should have 4 IO modes:

* Non-blocking,
* Fully-blocking (with SSL_MODE_AUTO_RETRY, based on application data throughput),
* Spongey-blocking (as now, without SSL_MODE_AUTO_RETRY),
* Transparent-blocking (one I/O per call, based on low-level IO throughput).

IMHO spongey-blocking mode is of little practical purpose to anyone. It blocks when you don't want it to and may sometimes return -1 WANT_READ/WANT_WRITE anyway. I would vote to replace spongey-blocking mode with transparent-blocking, if my vote ever meant anything.

Darryl
Re: renegotiating problem - connection hanging?
David Schwartz wrote:
> I hate to be rude, but do you understand *anything* about programming to standards? The 'select' and 'read' functions are standardized, as is blocking and non-blocking I/O. You have the guarantees specifically enumerated in the standard and cannot assume things not assured because future implementations might violate those assumptions.

I've explained in another thread why the standards you quote are written in the way they are. The standards have to speak in terms of the entire scope of the subject. The subject is the select() system call. So the select() standards have to take into account other file descriptor contexts that are not socket related and still be factually correct. This is why the terms readable and writable are used. You then have to relate that to a socket file descriptor.

> > Find me some coded proof or write a program to demonstrate the behavior you believe in.
>
> For the love of god, this whole thread started because of a program that demonstrated the behavior I believe in. Actual programs have broken because of corner cases.

Again this is an incorrect belief of yours. But I can understand your point of view; building another incorrect belief on top of another one is just human nature, so we'll let you off.

Your misunderstanding there is that you think the SSL layer API blocking concept is interchangeable with the socket layer blocking concept. They are not. But virtue of that the SSL layer is.

Your incorrect beliefs are:

* How application/kernel interacts with socket file descriptors when using select/poll/read/write.
* That OpenSSL already implements a compatible/interchangeable event model in relation to the above.

An SSL_read() may in fact issue a write() system call. Trying to tar the whole situation with the same brush is an incorrect belief. The SSL is another layer with its own wants and wishes; it does not conform to a simple data transform that can be driven by IO in one direction.

The blocking mode of the higher-level API calls is not interchangeable with the blocking mode of the lower-level system calls. But if you execute one system call per SSL-level call they _DO_ become interchangeable and transparent. At the moment they are not transparent.

Darryl
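Editor's sketch: the point above, that a read at the SSL layer may need the socket to become *writable* (and vice versa), means an event loop has to wait for whichever direction the upper layer asks for. The following is a minimal illustration using plain select(). The `io_want` enum and `wait_for_want()` are hypothetical stand-ins for the "want read" / "want write" indications mentioned in this thread; they are not OpenSSL API.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical stand-ins for the "want read" / "want write" results
 * that an SSL layer reports to its caller. */
enum io_want { IO_WANT_READ, IO_WANT_WRITE };

/* Wait up to timeout_ms for fd to become ready in the wanted direction.
 * Returns 1 if ready, 0 on timeout, -1 on error (errno set by select). */
int wait_for_want(int fd, enum io_want want, int timeout_ms)
{
    fd_set fds;
    struct timeval tv;

    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    if (want == IO_WANT_READ)
        return select(fd + 1, &fds, NULL, NULL, &tv);
    return select(fd + 1, NULL, &fds, NULL, &tv);
}
```

A caller would retry the high-level operation after `wait_for_want()` returns 1, passing whichever direction the previous attempt reported.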
Re: renegotiating problem - connection hanging?
Darryl Miles wrote:
> All you have to do inside OpenSSL is know whether your underlying IO mode is blocking or non-blocking, and set a flag every time you enter a high-level call from the application context.
>
> Then _BEFORE_ you issue any low-level I/O you test to see if your low-level IO is in blocking mode and the flag is reset. If that condition is true, return -1 and WANT_READ/WANT_WRITE, depending on what you were just about to try and do.
>
> Then _AFTER_ every time you issue a low-level I/O (read or write) you reset that flag.
>
> You now get one I/O per high-level call.

An omission. The test _BEFORE_ low-level IO needs to also check that SSL_MODE_AUTO_RETRY is not set (along with the other things listed above).

Darryl
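Editor's sketch: the proposal above (arm a flag on entry to each high-level call, refuse a second low-level I/O when the descriptor is blocking and auto-retry is off, reset the flag after each low-level I/O) might look roughly like the following. Every name here is hypothetical and none of it is OpenSSL code or API: `auto_retry` merely stands in for SSL_MODE_AUTO_RETRY being set, and `low_io` stands in for a single read() or write().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result codes: XIO_WANT_RETRY plays the role of
 * "return -1 with WANT_READ/WANT_WRITE" in the proposal. */
enum { XIO_OK = 0, XIO_WANT_RETRY = -1 };

struct xio {
    int fd_is_blocking;        /* underlying descriptor is in blocking mode */
    int auto_retry;            /* stand-in for SSL_MODE_AUTO_RETRY being set */
    int io_allowed;            /* the per-call flag from the proposal */
    int (*low_io)(void *ctx);  /* stand-in for one read()/write() */
    void *ctx;
};

/* Stub low-level I/O used for illustration: just counts invocations. */
int xio_count_io(void *ctx)
{
    ++*(int *)ctx;
    return XIO_OK;
}

/* Wrapper that every low-level I/O goes through. */
int xio_low_level(struct xio *x)
{
    /* BEFORE: on a blocking fd without auto-retry, only one I/O per call. */
    if (x->fd_is_blocking && !x->auto_retry && !x->io_allowed)
        return XIO_WANT_RETRY;
    int rc = x->low_io(x->ctx);
    /* AFTER: reset the flag so any further I/O in this call is refused. */
    x->io_allowed = 0;
    return rc;
}

/* A high-level call that happens to need ios_needed low-level I/Os. */
int xio_high_level(struct xio *x, int ios_needed)
{
    x->io_allowed = 1; /* set on every entry from the application */
    for (int i = 0; i < ios_needed; i++) {
        if (xio_low_level(x) == XIO_WANT_RETRY)
            return XIO_WANT_RETRY;
    }
    return XIO_OK;
}
```

With a blocking descriptor and auto-retry off, a high-level call that needs two low-level I/Os performs exactly one and then asks the caller to retry; a real implementation would also have to remember its progress between retries, which this sketch leaves out.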
Re: renegotiating problem - connection hanging?
David you are bringing completely unrelated issues into the situation.

David Schwartz wrote:
> ...SNIP...
>
> One other point, I didn't mention threads to argue that if another thread steals your data, the operation will clearly block. I mentioned it to show that it's impossible for 'select' to guarantee even that the next operation will block without breaking valid code. (Because that would require kernel omniscience to divine the intent of the programmer.) Consider:

Yes we are all aware of that and it's just another unrelated side track.

To disarm this point too: it is possible to know for sure that no other process or thread has access to your file descriptor. Duh! Also, do you take threaded programming so lightly that you think you can nick and borrow file descriptors in your application willy nilly? Duh!

> Should that write block or not? If you really think 'select' could ever guarantee that a future operation will not block, then the kernel should remember the 'select' hit and return immediately from that 'write'. However, the implementation has no way to know that the call to write from thread B had anything to do with the call to 'select' from thread A. Perhaps the code is unrelated and thread B needs normal blocking behavior. (Think of the bizarre race conditions this would cause.)

Err, you do when you are a competent multi-threaded programmer; this stuff is all basic schooling. You cannot use the same SSL context from two threads at the same time anyway; the high-level API calls are not thread safe.

And YES, select can guarantee the next operation would not block in the circumstances we are talking about. Otherwise those applications would be broken by design, and they are not.

The situation is the _NORMAL_ one: a single process, single thread has created a file descriptor associated with a network socket which is set in blocking mode. Nothing else on the host has access to that file descriptor because we created it since execve()/fork() was last called.

There is no point complicating matters by side-tracking into issues concerning:

* What if another thread does something with the fd
* What if another process has access to the same fd (dup across fork()/exec())

All of these issues are non-starters and unrelated to the problem being discussed; we are all aware of those issues.

The problem at hand is that ideally we want the two parallel blocking modes of the SSL layer to be direct equivalents of the host machine's two blocking modes at the socket layer. This allows transparency, which means your application doesn't need any design change.

All 3 modes I outlined before form the primitive modes of operation any application programmer would want from an IO layer.

> I understand the semantic difference between read and write, that's not my point here. My point is that 'select' can't control what a subsequent operation does because there's no way to positively identify a particular operation as 'subsequent' and the behavior you are expecting can break code that doesn't specifically ask for it. (Though I would argue that not checking for short reads on a blocking socket is a bug too, it's also common. But that's a whole other pet peeve of mine.)

Of course there is a well-defined subsequent, since the poll/select event system in the kernel and the file descriptor I/O buffers which drive those triggers have appropriate locking in place to make it well-defined behaviour.

Linux net/ipv4/tcp.c:319 tcp_poll(), the comment above it is:

    /*
     *  Wait for a TCP event.
     *
     *  Note that we don't need to lock the socket, as the upper poll layers
     *  take care of normal races (between the test and the event) and we don't
     *  go look at any of the socket buffers directly.
     */

If you believe what you say is true, please point at the kernel implementation that works the way you say it does. Linux does not work this way; it works the way Mikhail and I have explained.

> If you still believe that 'select' makes a subsequent 'read' on a socket non-blocking even if the socket is set blocking, just tell me one thing -- how do I ask for a *blocking* 'read' after a 'select' if that's what I want? (And there are certainly protocols where blocking reads could mean something, consider MSG_WAITALL.) Should I set the already-blocking socket blocking again?

No, it's not that the next read is non-blocking. It's that the next read() has data to read, or EOF or an error condition to report. Because of that, the next invocation of a related system call will not block. The select indicates that the event is ready, waiting and pending inside the kernel for the application to pull from the socket.

As Mikhail pointed out in another email, you have not explained what scenario can exist where that pending event disappears?

If another process or thread issues a read()/recvfrom()/recvmsg()/recv() on that file descriptor after
Re: renegotiating problem - connection hanging?
David Schwartz wrote:
> No. That you cannot think of a way does not mean that no way exists.

WTF ! Is dark the absence of light, or is light the absence of dark ?

Please prove your way exists; there are enough poll/select implementations available to inspect. Your words have no weight without any proof.

> People thought that if they got a write hit from select, they could then call write without blocking. Oops, if the write was too big, they blocked. They couldn't think of a way, but they were not *guaranteed* there was no way. (And there are lots of ways this breaks on non-standard implementations too, like systems with small buffers when the other side shrinks the window.)

Generally speaking, in the application programming paradigm we are talking about we WANT writes to block. When we are writing we are busy doing something for the peer at the other end. But when we are reading we are idle, waiting for the other end to give us something to do, but are unsure what.

This plays directly into a central idle event loop driven by poll/select, but when we are writing we have usually handed off to a specialized function within the application to do that specific task. The simple design of the program allows control to be where the application developer needs it.

It is probably the desire for blocking writes that compels the programmer to elect a blocking socket programming paradigm over a non-blocking socket paradigm.

> People thought that if they got a read hit on a listening socket, they could then call accept without blocking. Oops, if the connection terminated before they called accept, they blocked. They couldn't think of a way, but they were not *guaranteed* there was no way.

Transport-level accept() calls are again out of scope of the discussion at hand. Thanks for your schooling on sockets but your fundamentals need more work.

I mean, you would have thought that if you put an accept() socket into non-blocking mode, then the new sockets it created would inherit that non-blocking mode too. But it doesn't. Nice info but completely unrelated to the issue being discussed.

> This is how subtle bugs and corner cases bite you on the ass. You assume that because you can think of no way to break something, it is guaranteed to work. That's not how it works. It's guaranteed to work if the relevant standard provides such a guarantee.

Where are your specifications / standards on poll() or select() ? Does XOPEN, IEEE, POSIX or BSD mandate behavior ?

In the absence of a specification and a relevant test certificate in hand you are left with proving your claim with an implementation. That will do me :) Can you please point at that implementation.

> Now, you could have a guarantee and it still not work due to things like bugs or trade-offs that technically violate the standard but where someone thinks that's not so important. But this weighs even more strongly in favor of my point. It is trivial to avoid these bugs and corner cases by selecting non-blocking behavior.

We're getting away into chaos theory now. The butterfly that breaks wind in China causes the earthquake in the Antarctic.

I've ~15 years experience in unix socket layers; yes, my first 3 years were full of programming errors but I'm all better now.

> > As for your example with write() -- of course a write of 20Mb might block, but we are not talking about write(), we are talking about read(). They do have different semantics. Same goes for your mentions of accept().
>
> Only because people *discovered* that they do. How can you say nobody will later discover this about 'read'? Say someone adds an extension to TCP to allow one side to 'revoke' data that has not been read yet by the other side. Are you going to argue that they cannot do this? Or that POSIX prohibits it?

WTF. We're out past Mars now. We're guarding today's code from hypothetical future protocol extensions that might break it. W.T.F.

Don't you think the people who invent the next crazy idea thought about the whole problem much better than you and I, and you know lots.

I can almost guarantee the existing behavior will not change with any new extensions. Just think how you are going to get the zany idea past all those standards committees. If it's a bad idea Darwin will prevail.

Well yes, POSIX probably will prohibit it, as in the committee will reject it; you think everyone on the POSIX committee is going to say yes to a new thing which breaks multi-million investments in codebases? I think not.

> Suppose we are using SSL over a protocol that has sophisticated timeout detection. It detects a timeout and indicates that at that instant a read will not block (reporting connection close). But then, just before we call read, it gets the data it was waiting for. It then decides the connection does not need to be closed since the close was never received or acknowledged by
RE: renegotiating problem - connection hanging?
> David you are bringing completely unrelated issues into the situation.

No, you are failing to understand my argument.

> David Schwartz wrote:
> > ...SNIP...
> >
> > One other point, I didn't mention threads to argue that if another thread steals your data, the operation will clearly block. I mentioned it to show that it's impossible for 'select' to guarantee even that the next operation will block without breaking valid code. (Because that would require kernel omniscience to divine the intent of the programmer.) Consider:
>
> Yes we are all aware of that and it's just another unrelated side track.

No, it's not.

> To disarm this point too: it is possible to know for sure that no other process or thread has access to your file descriptor. Duh! Also, do you take threaded programming so lightly that you think you can nick and borrow file descriptors in your application willy nilly? Duh!

That has nothing to do with anything. The point is not what the application can know or do; the point is what the implementation of 'read' or 'write' can know or do. One thread detecting that a socket is writable and then asking another thread to do a write is hardly borrowing file descriptors willy nilly.

> > Should that write block or not? If you really think 'select' could ever guarantee that a future operation will not block, then the kernel should remember the 'select' hit and return immediately from that 'write'. However, the implementation has no way to know that the call to write from thread B had anything to do with the call to 'select' from thread A. Perhaps the code is unrelated and thread B needs normal blocking behavior. (Think of the bizarre race conditions this would cause.)
>
> Err, you do when you are a competent multi-threaded programmer; this stuff is all basic schooling. You cannot use the same SSL context from two threads at the same time anyway; the high-level API calls are not thread safe.

What does that have to do with my point?! My point is that 'select' cannot guarantee that a future operation will not block, because there is no way to tell whether a given operation is supposed to be a future operation that must not block or a normal operation that should block because the socket is blocking.

> And YES, select can guarantee the next operation would not block in the circumstances we are talking about. Otherwise those applications would be broken by design, and they are not.

No, it cannot. I *SHOWED* *WHY*. Because there is no way that either 'select' or the subsequent operation can pair themselves (without kernel omniscience). The kernel sees a 'select', then it sees an operation on a blocking socket. The kernel has no way to know that you are thinking of that operation as subsequent to the select, and I demonstrated cases where they can be incorrectly paired. So the kernel has no way to assure that the next operation will not block without breaking normal blocking semantics.

> The situation is the _NORMAL_ one: a single process, single thread has created a file descriptor associated with a network socket which is set in blocking mode. Nothing else on the host has access to that file descriptor because we created it since execve()/fork() was last called.
>
> There is no point complicating matters by side-tracking into issues concerning:
>
> * What if another thread does something with the fd
> * What if another process has access to the same fd (dup across fork()/exec())
>
> All of these issues are non-starters and unrelated to the problem being discussed; we are all aware of those issues.

The same problem occurs with one thread. Consider the following code, assume blocking sockets:

1) do some stuff
2) do a huge write, don't check for short writes since our socket is blocking

Now you come along and say the kernel can ensure that a select hit ensures a subsequent operation will not block. I say, what happens if I do the following:

1) do some stuff
1.5) do a 'select' to log which sockets are readable and writable for statistical purposes
2) do a huge write, don't check for short writes since our socket is blocking

Your proposal, having 'select' ensure the subsequent 'write' does not block, will *break* my code. And my reply to you would be, what should I do? Request blocking sockets again with an 'I REALLY MEAN IT THIS TIME' flag?

> The problem at hand is that ideally we want the two parallel blocking modes of the SSL layer to be direct equivalents of the host machine's two blocking modes at the socket layer. This allows transparency, which means your application doesn't need any design change.

I agree. 'SSL_read' should block until application data is available, just as a TCP 'read' does.

> Of course there is a well-defined subsequent, since the poll/select event system in the kernel and the file descriptor I/O buffers which drive those triggers have
RE: renegotiating problem - connection hanging?
David Schwartz wrote: No. That you cannot think of a way does not mean that no way exists. WTF ! Is dark the absence of light, or is light the absence of dark ? Please prove your way exists, there are enough poll/select implementations available to inspect. Your words have no weight without any proof. I hate to be rude, but do you understand *anything* about programming to standards? The 'select' and 'read' functions are standardized, as is blocking and non-blocking I/O. You have the guarantees specifically enumerated in the standard and cannot assume things not assured because future implementations might violate those assumptions. It is probably the desire for blocking writes that compels the programmer to elect a blocking socket programming paradigm over a non-blocking socket paradigm. This is how subtle bugs and corner cases bite you on the ass. You assume that because you can think of no way to break something, it is guaranteed to work. That's not how it works. It's guaranteed to work if the relevent standard provides such a guarantee. Where is your specifications / standards on poll() or select() ? Does XOPEN, IEEE, POSIX or BSD mandate behavior ? In the absence of a specification and a relevant test certificate in hand you are left with proving your claim with an implementation. That will do me :) Can you please point at that implementation. What the hell are you talking about? The relevent standard sections have been pasted. They do not guarantee that a future operation will not block. I explained why they cannot do so. What more do you need? I've ~15 years experience in unix socket layers, yes my first 3 years were full of programming errors but I'm all better now. Glad to hear it. WTF. We're out past Mars now. We're guarding todays code from hypothetical future protocol extensions that might break it. W.T.F. That is how you write good programs. 
You assume only what is guaranteed by the specifications because you know that future implementations will be different. All it takes to break your code is an error condition that can clear, say due to packets being received. I can almost guarantee the existing behavior will not change with any new extensions. Just think how are you going to get the zaney idea past all those standards committees. If its a bad idea darwin will prevail. Well yes POSIX probably will prohibit it, as in the commitee will reject it; you think everyone on the POSIX commitee is going to say yes to a new thing which breaks multi-million investments in codebases. I think not. How is this an argument for making code more fragile by relying on guarantees that don't exist? Well, it will break a lot of stuff, so creating a few more things that will break is okay. That's how you wind up with a *lot* of broken code when things really have to change. We are not on some crusade to say that you must use a blocking socket model. Both models have their strengths and weaknesses, non-blocking is more complex, blocking is less complex. You do however appear to be one some sort of crusade against. I am not against blocking socket models, they just block. Sometimes that's appropriate, sometimes it's not. What I am against are half-assed models that usually don't block, are required to never block, but might block if the unexpected happens. Then the unexpected happens as it did in this very case, and the application breaks. This is a precise case of the very breakage you insist cannot happen. The 'select' call gets a read hit because data is available. Later the implementation changes its mind when on closer inspection it appears the data is not the right data to satisfy the 'read'. So the 'read' blocks. You claim this can never happen when IT JUST DID. 
The reason we are having this conversation is because we found a case where the assumption breaks -- when the transport can hold both application and non-application data. Yet you still claim it can never happen. I'm baffled.

This same argument would apply to any protocol that can report non-fatal errors. You think POSIX designed 'poll' and 'select' to make non-fatal errors impossible?

Euh. A non-fatal error is a condition that is part of the normal and expected working of the resource:
* End of Stream/File
* No data available (EAGAIN)
* Signal interrupted us (EINTR)

A fatal error is an unexpected, catastrophic error relating to the resource:
* EBADF
* EINVAL
* EFAULT

Consider. A non-fatal error occurs on a stream, say a timeout. You get a 'select' hit on read because at that instant a 'read' will not block. Then, the error clears, maybe the waited for packet is received. So when you call 'read', there is no error and no data -- it blocks. Please show me where the standards for 'select' and 'read' guarantee
RE: renegotiating problem - connection hanging?
--On Wednesday, June 21, 2006 3:36 PM -0700 David Schwartz [EMAIL PROTECTED] wrote:

The same problem occurs with one thread. Consider the following code, assume blocking sockets:
1) do some stuff
2) do a huge write, don't check for short writes since our socket is blocking

That code is broken. Fix it. You must _always_ check for short writes. Not doing so is buggy code. Nothing in POSIX, SUSvn, or any other standard requires that write blocks until everything is written. And on many operating systems, it won't.

-- Carson
__
OpenSSL Project http://www.openssl.org
User Support Mailing List openssl-users@openssl.org
Automated List Manager [EMAIL PROTECTED]
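The "always check for short writes" rule can be sketched as a small retry loop. This is only an illustration in Python, whose `os.write` returns the number of bytes actually accepted, just as the C call does. The names `write_all` and `stingy_write` are made up for this example, and `stingy_write` is a hypothetical stand-in for a system that accepts only a few bytes per call.

```python
import os

def write_all(fd, data, write=os.write):
    """Write every byte of data to fd, retrying after short writes."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        # write() may accept fewer bytes than requested, even on a
        # blocking descriptor, so always use the returned count.
        total += write(fd, view[total:])
    return total

def stingy_write(fd, chunk):
    # Hypothetical writer that never accepts more than 3 bytes per call,
    # used here only to force short writes deterministically.
    return os.write(fd, chunk[:3])
```

Calling `write_all(fd, payload, write=stingy_write)` still delivers the whole payload, because the loop never assumes that a single call wrote everything.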
RE: renegotiating problem - connection hanging?
The same problem occurs with one thread. Consider the following code, assume blocking sockets: 1) do some stuff 2) do a huge write, don't check for short writes since our socket is blocking That code is broken. Fix it. You must _always_ check for short writes. Not doing so is buggy code. Nothing in POSIX, SUSvn, or any other standard requires that write blocks until everything is written. That is my whole point. Nothing in POSIX, SUSvn, or any other standard requires that a read after a select not block. So assuming it won't is buggy code. You must always set a socket non-blocking if you want a socket not to block because nothing else guarantees you won't block, just as setting a socket blocking does not guarantee that it will block until all the data is sent. And on many operating systems, it won't. That's really not the issue. The point is that on any system it could. Even if no system currently existed that had this property, the behavior is still not guaranteed, and that's what matters. In fact, assuming no signals, I don't know of any operating system that does have short writes (except in the face of fatal errors) on blocking TCP sockets. However, the fact that I don't know of (or even can't imagine) such a system doesn't provide me a *guarantee*. Now if there was no way at all to get a guarantee, we'd just be screwed and would have no choice but to hope for the best. But in this case, we have an easy way to get the guarantee. So assuming would be error. Assuming blocking TCP sockets won't show a short write unless there's a signal or fatal error is precisely the same error as assuming a blocking read after a select won't block. And, I'd like to point out one more time, we know of cases where a blocking read after a select will block. For example, if someone interposes OpenSSL between select/read/write and the OS. Someone *can* do this and people *do* do this. 
DS
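The point that only a non-blocking socket guarantees a read will not block can be sketched as follows. This is a minimal Python illustration, not code from the thread. The helper name `read_if_ready` and its `timeout` and `limit` parameters are assumptions for the example, and note that it leaves the socket in non-blocking mode.

```python
import select
import socket

def read_if_ready(sock, timeout=1.0, limit=4096):
    """Return received bytes, b"" on EOF, or None if nothing could be
    read without blocking."""
    # Non-blocking mode is what guarantees recv() cannot hang; the
    # select() result below is treated only as a hint.
    sock.setblocking(False)
    readable, _, _ = select.select([sock], [], [], timeout)
    if not readable:
        return None
    try:
        return sock.recv(limit)
    except BlockingIOError:
        # select() reported readiness, but by the time we read there
        # was nothing to return. Report "no data" instead of hanging.
        return None
```

With this shape, the caller never relies on the `select` result staying true. The worst case is an extra trip around the event loop.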
RE: renegotiating problem - connection hanging?
Sorry to come at this thread a week later. I can see both sides of the problem here; there are in fact three distinct modes of operation:

Non-Blocking: This is where the underlying socket descriptor is in non-blocking mode. SSL API calls return -1, and SSL_get_error() reports whether the cause is a retry condition (EAGAIN) or an error. Everybody in the thread agreed on this behavior.

Blocking (with SSL_MODE_AUTO_RETRY): This is where OpenSSL can use as many blocking read() calls as it wants in order to complete application-level data processing with the transport layer.

Blocking (without SSL_MODE_AUTO_RETRY): This is the default mode for a blocking socket, if I understand the comments correctly. In this mode only one blocking read() system call is allowed by the SSL library. If that call blocks, that's OK. If it doesn't block and returns data, the SSL API call must return control to the application, or, if the SSL library still wants to call read() for more data another time, it _MUST_ do so in a non-blocking fashion.

All 3 modes have valid use cases when you have a layer between transport and application. It is not correct to talk about the blocking transport layer API and the blocking SSL layer API as being interchangeable; these 3 modes cater for every usage case. Two of the modes are direct equivalents of the transport layer blocking/non-blocking modes; the third deals with the disparity of the extra SSL layer in an elegant way.

The problem this thread is covering is the Blocking (without SSL_MODE_AUTO_RETRY) case:

If I understand correctly, the original thread poster was explaining a bug in using OpenSSL s_client triggered during a renegotiation (was this client or server initiated? SGC related?).

It is unclear to me whether Marek thinks this problem is due to a library bug or simply that s_client should be clearing SSL_MODE_AUTO_RETRY in its blocking socket use case for that program.

We all know that OpenSSL s_client has a command line option to enable non-blocking mode, so the argument that we should be using non-blocking is bogus in this situation; maybe this should be the default for s_client anyway.

If it's a library bug, I believe Marek is saying it is wrong for OpenSSL to re-issue a blocking read() system call a 2nd time (within the same SSL_() API call) when SSL_MODE_AUTO_RETRY is not set. It should return -1 and WANT_READ to the application layer. If OpenSSL really wants to make a read() system call a 2nd time, it should first use poll/select on the transport layer or temporarily mark the socket non-blocking to guard against a blocking system call. This is because the application never authorized it to use more than one blocking system call, since SSL_MODE_AUTO_RETRY is not set.

Marek is correct that system calls return EINTR and other values which are valid non-fatal-error returns from blocking calls; the handling of writes simply allows a partial write to occur in the SSL API, so a -1 return from a blocking SSL API call is valid too.

I also agree that if Marek's finding is true it should not break any existing applications that are correctly written, that is, when an SSL API call returns -1 you find out why with SSL_get_error() and deal with the situation appropriately.

Maybe both parties could point out the errors in the summary above. Everyone seems to be correct in their comments but I'm not sure they are seeing the same view of the problem as I did when reading the thread.

Darryl
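The "one blocking read, then only non-blocking reads" rule described above can be sketched on a plain socket. This is a Python analogy under stated assumptions, not OpenSSL's implementation. The function name `read_one_block_then_drain` is invented for illustration, and it uses the "temporarily mark the socket non-blocking" option from the summary.

```python
import socket

def read_one_block_then_drain(sock, limit=4096):
    """Block in at most one recv(); any further reads are non-blocking."""
    first = sock.recv(limit)          # the single permitted blocking call
    if not first:
        return b""                    # peer closed the connection
    chunks = [first]
    old_timeout = sock.gettimeout()   # remember the caller's blocking mode
    sock.setblocking(False)
    try:
        while True:
            try:
                more = sock.recv(limit)
            except BlockingIOError:
                break                 # would block: hand control back
            if not more:
                break                 # EOF after some data
            chunks.append(more)
    finally:
        sock.settimeout(old_timeout)  # restore the original mode
    return b"".join(chunks)
```

The caller sees at most one blocking wait per call, and the socket is returned in the same mode it arrived in.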
RE: renegotiating problem - connection hanging?
If I understand correctly the original thread poster was explaining a bug in using OpenSSL s_client triggered during a renegotiation (was this client or server initiated? SGC related?).

- client sends some data
- server initiates renegotiation and immediately after that sends some data back to the client

It is unclear to me if Marek thinks this problem is due to a library bug or simply that s_client should be clearing SSL_MODE_AUTO_RETRY in its block socket use case for that program. We all know that OpenSSL s_client has a command line option to enable nonblocking mode so the discussion about we should be using non-blocking is bogus in this situation, maybe this should be the default for s_client anyway.

Well, if s_client is broken in the blocking mode maybe it should be removed completely. I did test it in the non-blocking mode and, of course, it does not have the described error.

If its a library bug I believe Marek is saying it is wrong that OpenSSL should be re-issuing a read() blocking system call for a 2nd time (within the same SSL_() API call) when SSL_MODE_AUTO_RETRY is not set. It should return -1 and WANT_READ to the application layer. If OpenSSL really wants to for a 2nd time make a read() system call it should use poll/select on the transport layer or temporarily mark the socket non-blocking to guard against a blocking system call before. This is because the application never authorized it to use more than one blocking system call because SSL_MODE_AUTO_RETRY is not set.

I think your summary is correct.
Re: renegotiating problem - connection hanging?
Mikhail Kruk wrote:

It is unclear to me if Marek thinks this problem is due to a library bug or simply that s_client should be clearing SSL_MODE_AUTO_RETRY in its block socket use case for that program. We all know that OpenSSL s_client has a command line option to enable nonblocking mode so the discussion about we should be using non-blocking is bogus in this situation, maybe this should be the default for s_client anyway.

Well, if s_client is broken in the blocking mode maybe it should be removed completely. I did test it in the non-blocking mode and, of course, it does not have the described error.

So are you saying the bug is:
* in s_client (for not correctly handling the SSL layer APIs), or
* in the SSL library (for issuing 2 blocking system calls when SSL_MODE_AUTO_RETRY is not set, within the same high-level SSL layer API call, SSL_read() in this particular case), or
* you didn't have time to nail down the precise cause

If the bug is in the SSL library then s_client may not be broken; it's simply exposing a bug in a corner case.

Darryl
Re: renegotiating problem - connection hanging?
Well, if s_client is broken in the blocking mode maybe it should be removed completely. I did test it in the non-blocking mode and, of course, it does not have the described error.

So are you saying the bug is:
* in s_client (for not correctly handling the SSL layer APIs) or
* the bug is in the SSL library (for issuing 2 blocking system calls when SSL_MODE_AUTO_RETRY is not set within the same high level SSL layer API call SSL_read() in this particular case) or
* you didn't have time to nail down the precise cause

If the bug is in the SSL library then s_client may not be broken, its simply exposing a bug in a corner case.

My first reaction was that this is a bug in the library, but I didn't feel very strongly about it and would have accepted that this is just a bug in s_client. I like your argument about the library not having the right to make 2 blocking calls unless retry is set, and now I'm back to thinking that this should be fixed in the library. I'm pretty confident that the fix (if it is feasible) is not going to break any correct application code. And I'm pretty confident that it is going to make app. engineer's life easier.
Re: renegotiating problem - connection hanging?
Mikhail Kruk wrote:

I'm pretty confident that the fix (if it is feasible) is not going to break any correct application code. And I'm pretty confident that it is going to make app. engineer's life easier.

My view comes from what is architecturally scalable by design which makes it recursive in nature. What I mean by this is that any IO layer that implements exactly those 3 modes would allow such IO layers to be stacked in one long pipe. And yes exactly; the whole point is to make the application end simple and to leave the complexity inside the library.

Darryl
RE: renegotiating problem - connection hanging?
My first reaction was that this is a bug in the library, but I didn't feel very strong about it and would have accepted that this is just a bug in s_client. I like your argument about the library not having the right to make 2 blocking calls unless retry is set and now I'm back to thinking that this should be fixed in the library. I'm pretty confident that the fix (if it is feasible) is not going to break any correct application code. And I'm pretty confident that it is going to make app. engineer's life easier.

Why is the number of blocking calls significant? How are two blocking calls different from one?

If anyone thinks that 'select' or 'poll' guarantees that a future operation will not block, even if it's a single operation, that's just plain not true. The only way you can guarantee that even one operation will not block is if you set the socket non-blocking.

I can't quite see the point of not setting auto-retry. You make blocking socket operations when you want to block until they can be completed. What possible use would half-blocking be? To try as hard as possible to find the corner cases where you think you won't block but then accidentally do?

Please show me the standard that says that the return value from 'select' or 'poll' guarantees that a future operation will not block. It cannot be done. Operations on blocking sockets can *always* block. That's why they're called 'blocking sockets'.

DS
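The claim that a `select` hit says nothing about a later read can be reproduced in a few lines. The sketch below is a Python illustration only. The function name `stale_readiness` is invented, and the first `recv` merely stands in for whatever changes the socket's state between `select` returning and the read that follows (another reader, a protocol layer consuming handshake data, and so on).

```python
import select
import socket

def stale_readiness():
    """Show a select() hit followed by a read that has nothing to return."""
    a, b = socket.socketpair()
    try:
        # Non-blocking mode is the only reason the second recv() below
        # raises instead of hanging forever.
        b.setblocking(False)
        a.sendall(b"x")
        ready, _, _ = select.select([b], [], [], 1.0)
        first = b.recv(16)   # state changes after select() has answered
        try:
            b.recv(16)
            second = "data"
        except BlockingIOError:
            second = "would block"
        return bool(ready), first, second
    finally:
        a.close()
        b.close()
```

On a blocking socket the second `recv` would simply hang, which is the failure mode the thread is arguing about.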
Re: renegotiating problem - connection hanging?
David Schwartz wrote:

My first reaction was that this is a bug in the library, but I didn't feel very strong about it and would have accepted that this is just a bug in s_client. I like your argument about the library not having the right to make 2 blocking calls unless retry is set and now I'm back to thinking that this should be fixed in the library. I'm pretty confident that the fix (if it is feasible) is not going to break any correct application code. And I'm pretty confident that it is going to make app. engineer's life easier.

Why is the number of blocking calls significant? How are two blocking calls different from one?

It's to do with the application programming paradigm for doing IO. You start with a working program using raw syscalls. We can use poll/select to drive the need to service IO requirements. Now you want to add OpenSSL to it. The claim here is that you now get an IO programming paradigm which doesn't conform to the model your program is already using for blocking IO, a model which works on the platform it's on, when there appears to be no need for that to be the case. Instead the concept of blocking IO now goes under the heading of voodoo and black magic, when I've not heard one reason why that has to be the case.

If anyone thinks that 'select' or 'poll' guarantees that a future operation will not block, even if it's a single operation, that's just plain not true. The only way you can guarantee that even one operation will not block is if you set the socket non-blocking.

Really. I don't think the IEEE and POSIX specifications are publicly available to quote you an extract. However, the Linux man page talks in those specific terms: a descriptor becoming writable means that a write operation will not block (I think it's widely accepted that they are referring only to the next single invocation of write()).
I believe a version of Linux (around 2.0 or 2.2) passed POSIX testing, and I don't believe the behaviour of the current kernel has changed since that time. So while I don't have the exact specs you want to hand, I'm confident the implementation of these syscalls within Linux is correct. But obviously the Linux implementation isn't the standard.

I can't quite see the point of not setting auto-retry. You make blocking socket operations when you want to block until they can be completed. What possible use would half-blocking be? To try as hard as possible to find the corner cases where you think you won't block but then accidentally do?

But that's the point. It's not about half-blocking; it's about the interaction between poll/select events and the OpenSSL library, so that the library more easily slots into application programs which conform to that platform's blocking socket IO model. I am expecting the library to just extend that. I have not heard any reason why it needs to be the way it is. Maybe if OpenSSL implemented an event notification scheme which would slot into the poll/select model there would not be a problem. But that's not the case, and I've heard no technical reason why a half-blocking mode is not possible.

Please show me the standard that says that the return value from 'select' or 'poll' guarantees that a future operation will not block. It cannot be done. Operations on blocking sockets can *always* block. That's why they're called 'blocking sockets'.

Maybe you can show me your standards that cover the select() system call you have? What standards are you working to? If there are no standards, maybe you can demonstrate an implementation of the select/poll system call that doesn't work that way. Then explain how normal blocking IO on sockets works on that platform (including interaction with signal handling, transport layer errors, buffers becoming full, etc.).
I'm pretty sure from all that I can describe to you how OpenSSL would work for that platform in that situation.

I'm thinking along the lines of transparency of that platform's blocking socket IO model, and less black magic voodoo (which only bites developers in the ass). In my limited wisdom this simply boils down to allowing just one blocking lower-layer IO to be performed per higher-level call in. Anything else needs to be restartable; since OpenSSL supports non-blocking mode, it's already got the infrastructure in place to be restartable.

In a way it's cheeky of you to be asking for standards to back up the syscall as if it has any bearing or weight in this thread. The thread comes from the more practical approach of how systems are and how implementations are of existing application software. The request is just for the operating system's blocking design paradigm to be maintained through to the high-level SSL API calls (SSL_read()/SSL_write(), etc.). It would be true to say that if the platform doesn't support the poll/select blocking socket paradigm we are expecting then that
Re: renegotiating problem - connection hanging?
If anyone thinks that 'select' or 'poll' guarantees that a future operation will not block, even if it's a single operation, that's just plain not true. The only way you can guarantee that even one operation will not block is if you set the socket non-blocking. Really. I dont think the IEEE and POSIX specifications are publicly available to quote you an extract. However the Linux man page talks in those specific terms of a descriptor becoming writingable means that a write operation will not block (I think its widely accepted that they are referring to only the next single invocation of write() will not block). Linux: Three independent sets of descriptors are watched. Those listed in readfds will be watched to see if characters become available for read- ing (more precisely, to see if a read will not block - in particular, a file descriptor is also ready on end-of-file) R. Stevens, Unix Network Programming, Volume 1, Second Edition, Section 6.3, page 153: 1. A socket is ready for reading if any of the following four conditions is true: a. The number of bytes of data in the socket receive buffer is greater than or equal to the current size of the low-water mark for the socket receive buffer. *** A read operation on the socket will not block and will return a value greater than 0 (i.e., the data that is ready to be read).*** We can set this low-water mark using the SO_RCVLOWAT socket option. It defaults to 1 for TCP and UDP sockets. b. The read half of the connection is closed (i.e., a TCP connection that has received a FIN). A read operation on the socket will not block and will return 0 (i.e., EOF). c. The socket is a listening socket and the number of completed connections is nonzero. An accept on the listening socket will normally not block, although we will describe a timing condition in Section 16.6 under which the accept can block. d. A socket error is pending. 
A read operation on the socket will not block and will return an error (-1) with errno set to the specific error condition. These pending errors can also be fetched and cleared by calling getsockopt and specifying the SO_ERROR socket option.
RE: renegotiating problem - connection hanging?
If anyone thinks that 'select' or 'poll' guarantees that a future operation will not block, even if it's a single operation, that's just plain not true. The only way you can guarantee that even one operation will not block is if you set the socket non-blocking. Really. I dont think the IEEE and POSIX specifications are publicly available to quote you an extract. They talk about a hypothetical operation and whether or not it would have blocked. However the Linux man page talks in those specific terms of a descriptor becoming writingable means that a write operation will not block (I think its widely accepted that they are referring to only the next single invocation of write() will not block). If that's what they're referring to, they're obviously in error. If the next write, for example, is for 982,320 bytes, it most certainly *will* block if the socket is blocking. Maybe you can show me your standards that cover the select() system call you have ? What standards are you working to ? If there are no standards maybe you can demonstrate an implementation of the select/poll system call that doesn't work that way. Then explain how normal blocking IO on sockets works on that platform (including interaction with signal handling, transport layer errors, buffers becoming full, etc..). I'm pretty sure form that all I can describe to you how OpenSSL would work for that platform in that situation. The return value from a 'select' call does not guarantee any future results, nor could it. That would insanely require the operating system to remember that you got some information from 'select' and modify the next blocking operation to be non-blocking. How this would work in a multi-threaded case is quite puzzling. If thread 1 gets a write hit from select, then thread 2 tries to write 192,903 should it block or not? How can the implementation know whether the write from thread 2 is expecting the 'select' information thread 1 got to mean it shouldn't block? That's crazy. 
If you want non-blocking behavior, the only thing that assures it is non-blocking sockets. For example, SuSv2 documents 'select' like this: http://www.opengroup.org/onlinepubs/007908799/xsh/select.html Search it for the phrase 'block' and you will see lots of stuff about when and how 'select' blocks but nothing about some future operation not blocking. I'm thinking along the lines of transparency of that platforms blocking socket IO model, and less black magic voodoo (which only bite developers in the ass). There are just too many corner cases. It is simply not possible to assure that you will never block if you perform operations on blocking sockets. In a way its cheeky of you to be asking for standards to backup the syscall as if it has any bearing or weight in this thread. The thread comes from the more practical approach of how systems are and how implementations are of existing application software. The request is just for the operating systems blocking design paradigm to be maintained through to the high level SSL API calls (SSL_read()/SSL_write()) etc.. It would be true to say that if the platform doesn't support the poll/select blocking socket paradigm we are expecting then that program (you are adding OpenSSL) won't be written in that way. This makes what the exact specification of poll/select is in relation to blocking IO irrelevant, we just want transparency of whatever that model is at syscall level on that platform brought out to the SSL API calls. I agree with you. My position is that SSL_read should behave just like read, blocking until it can return data to the caller. A blocking 'read' or 'SSL_read' must only be made when you *know* there will be data to be returned. So there is no point in getting tangled up on the topic of select/poll syscall specification. 
Well, if the argument is that 'SSL_read' should have different semantics from 'read' because doing so is needed to allow you to assure that you will never block even if you use a blocking socket, it is important to show that you do not have this guarantee with 'select' and 'read', so the fact that you don't have it with 'select' and 'SSL_read' is not a semantic difference. (Just a legal corner case that happens much more frequently.) The error is this simple -- you cannot call 'read' on a blocking TCP socket or 'SSL_read' on a blocking SSL socket unless you *know* that there is appplication level data to be returned or you may risk blocking forever, any possible return value from 'select' or 'poll' notwithstanding. (Because 'select' and 'poll' *never* guarantee what a future operation will do. POSIX does not say so. SuS does not say so.) The Linux man page seems to say so, and I don't know if it's simple error or just bad phrasing. For example, does will not block mean will not block in a subsequent operation? If
RE: renegotiating problem - connection hanging?
Linux: Three independent sets of descriptors are watched. Those listed in readfds will be watched to see if characters become available for read- ing (more precisely, to see if a read will not block - in particular, a file descriptor is also ready on end-of-file) You'll notice that POSIX and SuS do not say this. The problem with this is that it seems like they're talking about some 'read' in the future, but they're actually talking about a hypothetical concurrent read. The kernel does not remember that it returned a read indication and issue a subsequent read operation as non-blocking. I think the authors of this man page simple misunderstood the standards. When the standards at that time spoke of a read that would not block they didn't mean after 'select' returned, they meant after 'select' was invoked. (See below.) R. Stevens, Unix Network Programming, Volume 1, Second Edition, Section 6.3, page 153: 1. A socket is ready for reading if any of the following four conditions is true: a. The number of bytes of data in the socket receive buffer is greater than or equal to the current size of the low-water mark for the socket receive buffer. *** A read operation on the socket will not block and will return a value greater than 0 (i.e., the data that is ready to be read).*** That does not mean that you can issue any read operation at any later time and it's guaranteed not to block. It just means that at some instant in-between when you called 'select' and when it returned, a hypothetical 'read' operation would not have blocked. By the time you get the information, it's out of date, just like 'access' and pretty much every other status function. It uses the term will because it means at the time the socket was determined to be ready for reading, not some later time when you noticed that. The 'select' function is just like any other status function. If you ask the kernel the size of a file and it says 154,323 bytes, does that mean it's 154,323 bytes now? 
All it means is that at some point in-between when you called the function and when it returned, the size *was* 154,323 bytes. If there was some function that blocked until a file was a particular size, it could well be documented as "this function will return a success indication if a 'stat' function would return the size waited for." It could even say, "a match is generated if a 'stat' function will return the size waited for." Does this mean a *subsequent* 'stat' function *must* return that size? OF COURSE NOT, and it's sheer madness to suggest otherwise.

Seriously, these are harmful assumptions that break code. You simply don't have the guarantee you claim you have. It may be true some or most of the time, but it's not guaranteed and it's error to rely on it. OpenSSL does everyone a disservice by trying to force it to work somehow.

DS
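The file-size analogy above is easy to make concrete. The following Python sketch is only an illustration. The function name `size_then_grow` and the tiny 5-byte file are assumptions chosen to keep the example short.

```python
import os
import tempfile

def size_then_grow(extra=b"more"):
    """Return (size reported by stat, size after a later append)."""
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(b"12345")
        path = f.name
    try:
        # A status call is a snapshot: true at some instant during the call.
        reported = os.stat(path).st_size
        # The file changes after the answer has already been handed out.
        with open(path, "ab") as f:
            f.write(extra)
        actual = os.stat(path).st_size
    finally:
        os.unlink(path)
    return reported, actual
```

The first number was correct when it was produced and is wrong by the time anyone acts on it, which is the sense in which `select` is described here as a status function.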
Re: renegotiating problem - connection hanging?
Hello, If anyone thinks that 'select' or 'poll' guarantees that a future operation will not block, even if it's a single operation, that's just plain not true. The only way you can guarantee that even one operation will not block is if you set the socket non-blocking. Really. I dont think the IEEE and POSIX specifications are publicly available to quote you an extract. However the Linux man page talks in those specific terms of a descriptor becoming writingable means that a write operation will not block (I think its widely accepted that they are referring to only the next single invocation of write() will not block). Linux: Three independent sets of descriptors are watched. Those listed in readfds will be watched to see if characters become available for read- ing (more precisely, to see if a read will not block - in particular, a file descriptor is also ready on end-of-file) R. Stevens, Unix Network Programming, Volume 1, Second Edition, Section 6.3, page 153: 1. A socket is ready for reading if any of the following four conditions is true: a. The number of bytes of data in the socket receive buffer is greater than or equal to the current size of the low-water mark for the socket receive buffer. *** A read operation on the socket will not block and will return a value greater than 0 (i.e., the data that is ready to be read).*** We can set this low-water mark using the SO_RCVLOWAT socket option. It defaults to 1 for TCP and UDP sockets. b. The read half of the connection is closed (i.e., a TCP connection that has received a FIN). A read operation on the socket will not block and will return 0 (i.e., EOF). c. The socket is a listening socket and the number of completed connections is nonzero. An accept on the listening socket will normally not block, although we will describe a timing condition in Section 16.6 under which the accept can block. d. A socket error is pending. 
A read operation on the socket will not block and will return an error (-1) with errno set to the specific error condition. These pending errors can also be fetched and cleared by calling getsockopt and specifying the SO_ERROR socket option.

Sorry for the late response, but I was out of the office. In s_client we have a blocking TCP socket with an SSL stack on top of it, working with SSL_MODE_AUTO_RETRY off. In this mode (according to the OpenSSL documentation) we may receive SSL_ERROR_WANT_READ if renegotiation occurs. From the OpenSSL SSL_read() documentation:

-- If the underlying BIO is blocking, SSL_read() will only return,
-- once the read operation has been finished or an error occurred,
-- except when a renegotiation take place, in which case a
-- SSL_ERROR_WANT_READ may occur.

So the application should be prepared for this event, because it may happen. If the application does not handle this situation, then when SSL renegotiation occurs it may simply drop the connection, thinking that some critical error occurred.

In the case under discussion we have a started renegotiation (handshake packet exchange) and ONE PACKET with application data IN RENEGOTIATION.

First SSL_read() after select():
- read hello_request (handshake - renegotiation)
- write client_hello (handshake - renegotiation)
- read UNEXPECTED ENCRYPTED APPLICATION DATA; this data may be returned to the caller because OpenSSL has support for such a case.

Second SSL_read() after select():
- read/write rest of renegotiation packets
- read for application data -- and here we have the HANG

I made these assumptions:
- s_client wants to work in this SSL mode
- s_client is prepared for the SSL_ERROR_WANT_READ event (because it is :-)

so a solution may be to return from SSL_read() with SSL_ERROR_WANT_READ when a full or partial renegotiation has been performed.
Of course, we may discuss when we should do this, after reading one handshake packet, or after all renegotiation, or in other way, but past discussion develop in wrong way: - you're screwed - no, you're screwed - no, no, you're screwed - no, no, no, you're screwed But I'm talking about facts and real implementation - not theory. Best regards, -- Marek Marcola [EMAIL PROTECTED] __ OpenSSL Project http://www.openssl.org User Support Mailing Listopenssl-users@openssl.org Automated List Manager [EMAIL PROTECTED]
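Conditions (a) and (b) from the Stevens excerpt can be observed directly. What follows is only a minimal sketch, assuming a POSIX system and a local AF_UNIX socket pair rather than TCP; the helper names readable_now() and demo_read_readiness() are made up for this illustration. It shows what select() reports at the instant of the call, in a single thread with nothing happening between select() and read(), and says nothing about cases where something does happen in between.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns 1 if select() reports fd readable right now, 0 if not, -1 on error.
 * A zero timeout turns select() into a pure poll. (Hypothetical helper.) */
int readable_now(int fd)
{
    fd_set rfds;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE);
    tv.tv_sec = 0;
    tv.tv_usec = 0;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    if (select(fd + 1, &rfds, NULL, NULL, &tv) < 0)
        return -1;
    return FD_ISSET(fd, &rfds) ? 1 : 0;
}

/* Walks through conditions (a) and (b) on a local socket pair.
 * Returns 0 if every observation matched, otherwise the failing step. */
int demo_read_readiness(void)
{
    int sv[2];
    char c;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return 1;
    if (readable_now(sv[0]) != 0)        /* nothing buffered yet */
        return 2;
    if (write(sv[1], "x", 1) != 1)
        return 3;
    if (readable_now(sv[0]) != 1)        /* (a): data is in the receive buffer */
        return 4;
    if (read(sv[0], &c, 1) != 1)         /* read returns a value greater than 0 */
        return 5;
    if (shutdown(sv[1], SHUT_WR) != 0)   /* peer closes its sending half */
        return 6;
    if (readable_now(sv[0]) != 1)        /* (b): read half of the connection closed */
        return 7;
    if (read(sv[0], &c, 1) != 0)         /* read returns 0, i.e. EOF */
        return 8;
    close(sv[0]);
    close(sv[1]);
    return 0;
}
```

Calling demo_read_readiness() from main() and checking for a zero return is enough to try it.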
RE: renegotiating problem - connection hanging?
> You are now introducing some weirdness into our little blocking world. Threads and other scary stuff. Yes, if a gremlin reads the data from the buffer between calls to select() and read(), the read() call might block. But if we assume that there is only one process with single thread using the socket, isn't it a valid assumption that if at some point during the select() call the data appeared in the buffer, there was *no way* for it to disappear from the buffer without us calling read()?

No. That you cannot think of a way does not mean that no way exists.

People thought that if they got a write hit from select, they could then call write without blocking. Oops, if the write was too big, they blocked. They couldn't think of a way, but they were not *guaranteed* there was no way. (And there are lots of ways this breaks on non-standard implementations too, like systems with small buffers when the other side shrinks the window.)

People thought that if they got a read hit on a listening socket, they could then call accept without blocking. Oops, if the connection terminated before they called accept, they blocked. They couldn't think of a way, but they were not *guaranteed* there was no way.

This is how subtle bugs and corner cases bite you on the ass. You assume that because you can think of no way to break something, it is guaranteed to work. That's not how it works. It's guaranteed to work if the relevant standard provides such a guarantee.

Now, you could have a guarantee and it still not work due to things like bugs or trade-offs that technically violate the standard but where someone thinks that's not so important. But this weighs even more strongly in favor of my point. It is trivial to avoid these bugs and corner cases by selecting non-blocking behavior.

> As for your example with write() -- of course a write of 20Mb might block, but we are not talking about write(), we are talking about read(). They do have different semantics. Same goes to your mentions of accept().

Only because people *discovered* that they do. How can you say nobody will later discover this about 'read'?

Say someone adds an extension to TCP to allow one side to 'revoke' data that has not been read yet by the other side. Are you going to argue that they cannot do this? Or that POSIX prohibits it?

Suppose we are using SSL over a protocol that has sophisticated timeout detection. It detects a timeout and indicates that at that instant a read will not block (reporting connection close). But then, just before we call read, it gets the data it was waiting for. It then decides the connection does not need to be closed since the close was never received or acknowledged by the application. Now the read blocks.

Are you saying such a protocol cannot exist? Or that 'select' cannot work with such a protocol? Or that OpenSSL should break on such a protocol?

This same argument would apply to any protocol that can report non-fatal errors. You think POSIX designed 'poll' and 'select' to make non-fatal errors impossible?

We have a method that is guaranteed to work -- making the socket non-blocking.

> After bringing in threads, write() and accept() calls into this discussion you have to either mention signals or accept that you are out of arguments.

I mentioned threads not to argue that another thread might read the data but to show that it's impossible for select to make a subsequent operation non-blocking because there is no reliable way to pair select operations with subsequent operations.

Simply put, you are guaranteed that 'read' will not block after a 'select' hit for read if and only if two things are provably the case:

1) Nothing of significance can change in between when 'select' returns a read hit and when you call read.

2) The test for an immediate return from 'read' is precisely identical to the test for a read hit from 'select'.

Can you find what guarantees or assures these two factors?

'SSL_read' should have the same semantics as 'read'. For a blocking socket, it should block until data can be returned and should only be used when you are willing to block until there is such data. For a non-blocking socket, it should do at least some work if any is possible (ideally as much as is possible) and if none is possible without blocking, it should return an appropriate indication.

However, it is a serious error to promote bad programming practice by pretending that you can use 'select' or 'poll' to ensure that a future operation on a blocking socket will not block. You simply cannot do this. It does not and has not ever worked.

This is a case where you have a trivial way to guarantee the behavior you want, but instead you rely on a technique not guaranteed to provide the behavior you want. Surprise, you don't get the behavior you want. The solution is to fix your
RE: renegotiating problem - connection hanging?
One more point, and then I'll try to shut up. ;)

You could argue that we could just fix this and deprecate fake non-blocking I/O for future major versions. The argument would be that this won't break any application that's not broken already and might fix existing applications.

My response to that would be that there might be applications that erroneously call SSL_read expecting it to block until application data is available because it has before in their testing. They can argue that they set the socket blocking specifically so that it would block. (It seems the cases in which SSL_read blocks after a read select hit are consistent given consistent timing, so code could rely on it.)

Is their code broken already? I suppose, because they wanted retries and didn't ask for them. But your code wanted non-blocking and didn't ask for it.

I would argue that a change that can only fix definitely broken code and can break only probably broken code isn't worth making. There's an obvious way to get total fixing in a subsequent version: match the semantics of 'read' in 'SSL_read'.

One other point: I didn't mention threads to argue that if another thread steals your data, the operation will clearly block. I mentioned it to show that it's impossible for 'select' to guarantee even that the next operation will not block without breaking valid code. (Because that would require kernel omniscience to divine the intent of the programmer.)

Consider:

1) Thread A gets a write hit from 'select' for blocking socket 9.

2) Thread B does a 129,029 byte 'write' to socket 9. (It may or may not be relying on the write hit from 'select', the implementation cannot tell.)

Should that write block or not? If you really think 'select' could ever guarantee that a future operation will not block, then the kernel should remember the 'select' hit and return immediately from that 'write'. However, the implementation has no way to know that the call to write from thread B had anything to do with the call to 'select' from thread A. Perhaps the code is unrelated and thread B needs normal blocking behavior. (Think of the bizarre race conditions this would cause.)

I understand the semantic difference between read and write, that's not my point here. My point is that 'select' can't control what a subsequent operation does because there's no way to positively identify a particular operation as 'subsequent' and the behavior you are expecting can break code that doesn't specifically ask for it. (Though I would argue that not checking for short reads on a blocking socket is a bug too, it's also common. But that's a whole other pet peeve of mine.)

If you still believe that 'select' makes a subsequent 'read' on a socket non-blocking even if the socket is set blocking, just tell me one thing -- how do I ask for a *blocking* 'read' after a 'select' if that's what I want? (And there are certainly protocols where blocking reads could mean something, consider MSG_WAITALL.) Should I set the already-blocking socket blocking again?

There's an obvious common-sense way to resolve this, and pretty much only one way -- if the application wants non-blocking behavior, it has to ask for it. If it asks for blocking behavior, it should get that.

Okay, I'll shut up now. This is just one of my pet peeves because it's a bug I have to frequently track down and fix and I'm getting tired of people evangelizing *for* the bug and encouraging people to make it.

DS
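The write case described above is easy to reproduce. The sketch below is only an illustration under stated assumptions: a POSIX system, a local AF_UNIX socket pair standing in for "socket 9", a single thread, and made-up helper names (writable_now, write_after_hit). The socket is deliberately set non-blocking so that the oversized write returns a short count instead of hanging; on a blocking socket the same call would wait until the peer drained the buffer.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Returns 1 if select() reports fd writable right now, 0 if not, -1 on error.
 * (Hypothetical helper for this sketch.) */
int writable_now(int fd)
{
    fd_set wfds;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE);
    tv.tv_sec = 0;
    tv.tv_usec = 0;
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    if (select(fd + 1, NULL, &wfds, NULL, &tv) < 0)
        return -1;
    return FD_ISSET(fd, &wfds) ? 1 : 0;
}

/* Gets a write hit, then attempts a single write of `len` bytes that the
 * peer never reads. Returns the number of bytes write() accepted, or -1 if
 * setup failed or write() itself failed. */
long write_after_hit(size_t len)
{
    int sv[2];
    int flags;
    long result = -1;
    char *buf = calloc(1, len);

    if (buf == NULL)
        return -1;
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        free(buf);
        return -1;
    }
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags != -1 &&
        fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) != -1 &&
        writable_now(sv[0]) == 1) {           /* the "write hit" */
        ssize_t n = write(sv[0], buf, len);   /* far more than the buffer holds */
        result = (n < 0) ? -1 : (long)n;
    }
    close(sv[0]);
    close(sv[1]);
    free(buf);
    return result;
}
```

With a request much larger than the socket buffer (say 16 MB), the return value should come back positive but smaller than the request, even though select() had just reported the descriptor writable.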
Re: renegotiating problem - connection hanging?
On Tue, Jun 20, 2006 at 07:50:24PM -0700, David Schwartz wrote:

> You could argue that we could just fix this and deprecate fake non-blocking I/O for future major versions. The argument would be that this won't break any application that's not broken already and might fix existing applications.

Regardless of the merits of single read() per SSL-read() or not. Two observations:

1. The code path in question only applies to SSLv3 and TLSv1. The SSLv2 code does not look at the RETRY flag, likely because SSLv2 does not support data in the middle of a handshake.

2. The code in question is ssl3_read_internal():

    static int ssl3_read_internal(SSL *s, void *buf, int len, int peek)
        {
        int ret;

        clear_sys_error();
        if (s->s3->renegotiate) ssl3_renegotiate_check(s);
        s->s3->in_read_app_data=1;
        ret=s->method->ssl_read_bytes(s,SSL3_RT_APPLICATION_DATA,buf,len,peek);
        if ((ret == -1) && (s->s3->in_read_app_data == 2))
            {
            /* ssl3_read_bytes decided to call s->handshake_func, which
             * called ssl3_read_bytes to read handshake data.
             * However, ssl3_read_bytes actually found application data
             * and thinks that application data makes sense here; so disable
             * handshake processing and try to read application data again. */
            s->in_handshake++;
            ret=s->method->ssl_read_bytes(s,SSL3_RT_APPLICATION_DATA,buf,len,peek);
            s->in_handshake--;
            }
        else
            s->s3->in_read_app_data=0;

        return(ret);
        }

That second ssl3_read_bytes() is called when in_read_app_data == 2, which is set when data arrives when a handshake was expected.

    case SSL3_RT_APPLICATION_DATA:
        /* At this point, we were expecting handshake data,
         * but have application data. If the library was
         * running inside ssl3_read() (i.e. in_read_app_data
         * is set) and it makes sense to read application data
         * at this point (session renegotiation not yet started),
         * we will indulge it.
         */
        if (s->s3->in_read_app_data &&
            (s->s3->total_renegotiations != 0) &&
            ((
                (s->state & SSL_ST_CONNECT) &&
                (s->state >= SSL3_ST_CW_CLNT_HELLO_A) &&
                (s->state <= SSL3_ST_CR_SRVR_HELLO_A)
            ) || (
                (s->state & SSL_ST_ACCEPT) &&
                (s->state <= SSL3_ST_SW_HELLO_REQ_A) &&
                (s->state >= SSL3_ST_SR_CLNT_HELLO_A)
            )))
            {
            s->s3->in_read_app_data=2;
            return(-1);
            }

Perhaps the backtracking to reprocess the event as data involves a second blocking socket read() in ssl3_read_bytes(). I am not familiar with the details of this code.

What I am curious about is when does this happen. What is it exactly that the server is doing here, why, and is it legal?

--
Viktor.
Re: renegotiating problem - connection hanging?
> Perhaps the backtracking to reprocess the event as data involves a second blocking socket read() in ssl3_read_bytes(). I am not familiar with the details of this code. What I am curious about is when does this happen. What is it exactly that the server is doing here, why, and is it legal?

Did you see the -debug -msg output I posted before? Basically the server sits in select(); when a socket becomes readable it starts a renegotiation

    int ret = SSL_renegotiate(p_ssl);
    ..
    ret = SSL_do_handshake(p_ssl);

then immediately reads the data from the socket and writes back a response. The server is built using the same version of OpenSSL (0.9.8a).

So the question is whether it is legal for the server to send data while renegotiation is in progress? I don't know... but as far as I can tell I'm not doing anything illegal as an application programmer. I don't think I'm supposed to wait for rehandshake to finish (it doesn't even have to happen as far as I understand).

The socket on the server is blocking (don't tell David!)

I hope I'm answering your question. I've looked at the ssl3_read_internal() code a couple of times but there is no way I can understand it without spending at least half a day, preferably with a debugger, and I didn't have time to do that. I can probably build a simple server to reproduce the problem. Should I?
RE: renegotiating problem - connection hanging?
> So the question is whether it is legal for the server to send data while renegotiation is in progress?

The server is not supposed to be required to track whether renegotiation is in progress or not. If you try to send data while a renegotiation is in progress, the OpenSSL library is obligated to do the right thing. That probably means returning an "I can't do that until something else happens" error. However, it must first try to do whatever it is that needs to be done (that was its job in the first place!) and that may mean blocking if the socket is blocking.

> I don't know... but as far as I can tell I'm not doing anything illegal as an application programmer. I don't think I'm supposed to wait for rehandshake to finish (it doesn't even have to happen as far as I understand).

Well, if you do need to wait for it to finish, OpenSSL should accurately report that to you, perhaps by returning a WANT_READ if it needs to read data before it can send some. However, it is perfectly legal for it to try the operation before it concludes that it cannot do it. That can block if the socket is blocking.

> The socket on the server is blocking (don't tell David!)

As I've said, operations on blocking sockets can block. The server cannot ensure that it does not block unless it saves the blocking state of the socket, changes it to non-blocking, and then changes it back. All of this would be pointless because you specifically told the library to block -- why should it go out of its way to refuse to do what you asked it to do?

> I hope I'm answering your question. I've looked at the ssl3_read_internal() code a couple of times but there is no way I can understand it without spending at least half a day, preferably with a debugger, and I didn't have time to do that. I can probably build a simple server to reproduce the problem. Should I?

The situation seems to be very complex. That is where corner cases poke their ugly heads.

If you do a blocking 'SSL_write', and it has to read what it knows is protocol data before it can even try writing, what do you think should happen? What if you refuse to call 'SSL_read' because you *know* that there cannot be application data to read because the application protocol requires the other side to send first? (This would make 'SSL_write' *nothing* like a regular TCP write.)

I think it's really clear what correct operation is. A blocking 'SSL_read' should block until data can be read, whether that means blocking in 'read' or 'write'. A blocking 'SSL_write' should block until data can be written, whether that means blocking in 'read' or 'write'. Non-blocking operations should, ideally, make as much forward progress as is possible without blocking, whether this means calling 'read', 'write', or both. This will match TCP semantics very well whether the application wishes to block or not. (Though my view of ideal semantics doesn't have much of anything to do with this discussion, except as a wish for what saner people might have done with the benefit of hindsight.)

Forcing a client that does a blocking 'SSL_read' to call 'SSL_write' even though it has no application-level data to send strikes me as *really* odd. Without this, how can you ensure a call to 'SSL_read' doesn't block if it gets protocol data instead of negotiation data? Clearly, a blocking 'SSL_read' could never call 'write' and a blocking 'SSL_write' could never call 'read'. This means they cannot block for application data, which is the reason the application asked for blocking behavior.

You keep trying to make bricks without straw. It simply cannot be done.

DS
Re: renegotiating problem - connection hanging?
Hello,

> If a blocking application sets SSL_MODE_AUTO_RETRY, SSL_read() will only return once data is available, or a real error occurs. This must not change.

It is not set for s_client. We are talking about this case.

Best regards,
--
Marek Marcola
RE: renegotiating problem - connection hanging?
Hello,

> Your proposition was to add further breakage. It is a mistake to issue a blocking socket operation if you do not wish to block, end of story. This is just a single example of one way this can break and it is impossible to fix it completely without breaking proper blocking applications that really do want to block.

My proposition only clarifies what is already implemented.

Best regards,
--
Marek Marcola
RE: renegotiating problem - connection hanging?
Hello,

> There are a huge number of corner cases I did not address, and it was not my intent to be a 100% complete discussion of the use of SSL_read.

We are talking about one specific (renegotiation) case.

> Nevertheless, I stand by my analysis of his problem.

OK :-)

> He called SSL_read with a blocking socket even though he did not want to block. SSL_read has no way of knowing that he doesn't want to block because he lied to it by invoking it as a blocking operation.

Not HE, but the OpenSSL developer who created s_client OR the OpenSSL developer who created SSL_read().

> There will never be any perfectly satisfactory solution to this problem.

I agree.

> SSL_read has no way of knowing whether he really wanted to block until application-level data was available or whether he really didn't want to block -- and short of modifying it to call 'select' before it calls 'read' each time or to save the blocking state of the socket, set it to non-blocking, and then set it back, no conceivable implementation of SSL_read can guarantee that it won't block when called on a blocking socket.

OK, but SSL_read() knows that it is doing renegotiation, and sometimes SSL_read() returns information that the user should call it again. A properly written application should be prepared (today) to handle this. If not, it may fail now too. So returning WANT_READ if renegotiation takes place changes nothing but solves this problem.

I stand by my analysis of his problem because from my point of view the library is for the user, not the user for the library :-)

Best regards,
--
Marek Marcola
RE: renegotiating problem - connection hanging?
>> There are a huge number of corner cases I did not address, and it was not my intent to be a 100% complete discussion of the use of SSL_read.

> We are talking about one specific (renegotiation) case.

>> Nevertheless, I stand by my analysis of his problem.

> OK :-)

>> He called SSL_read with a blocking socket even though he did not want to block. SSL_read has no way of knowing that he doesn't want to block because he lied to it by invoking it as a blocking operation.

> Not HE, but the OpenSSL developer who created s_client OR the OpenSSL developer who created SSL_read().

If an OpenSSL developer called SSL_read with a blocking socket in a case where they did not want to block until application-level data was available, then that is an error.

>> There will never be any perfectly satisfactory solution to this problem.

> I agree.

>> SSL_read has no way of knowing whether he really wanted to block until application-level data was available or whether he really didn't want to block -- and short of modifying it to call 'select' before it calls 'read' each time or to save the blocking state of the socket, set it to non-blocking, and then set it back, no conceivable implementation of SSL_read can guarantee that it won't block when called on a blocking socket.

> OK, but SSL_read() knows that it is doing renegotiation, and sometimes SSL_read() returns information that the user should call it again. A properly written application should be prepared (today) to handle this. If not, it may fail now too. So returning WANT_READ if renegotiation takes place changes nothing but solves this problem.

Properly-written applications don't make blocking socket operations when they don't want to block.

> I stand by my analysis of his problem because from my point of view the library is for the user, not the user for the library :-)

It is this simple -- making a call on a blocking socket when you are really thinking in your head "I must not ever block" can *never* be made to work reliably. Trying to make it seem like it might work a bit more often just reinforces an extremely bad habit. You can never cover all the corner cases -- you can never make this work.

If you do not wish to block, you *must* set the socket non-blocking. That is *the* mechanism that assures you will not block and it is the *only* mechanism that does so.

If SSL_read is supposed to be usable on a blocking socket in any context in which it is supposed to be guaranteed not to block, it *must* set the socket non-blocking itself. There simply is no other way to avoid blocking. Fixing this one case won't fix all the other cases.

If you must not block, you must set your sockets non-blocking. I'm sorry, but that's a fact.

DS
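For reference, "save the blocking state of the socket, set it to non-blocking, and then set it back" looks roughly like this at the descriptor level. This is a sketch assuming POSIX fcntl() and a plain descriptor; the function names are invented for the illustration and nothing here is OpenSSL code.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Performs one read() that cannot block, whatever mode fd is currently in.
 * Returns what read() returned; errno is the one left by read().
 * (Hypothetical helper, not part of any library.) */
ssize_t read_once_noblock(int fd, void *buf, size_t len)
{
    int saved, read_errno;
    ssize_t n;

    assert(fd >= 0);
    saved = fcntl(fd, F_GETFL, 0);                 /* save the current mode */
    if (saved == -1)
        return -1;
    if (fcntl(fd, F_SETFL, saved | O_NONBLOCK) == -1)
        return -1;

    n = read(fd, buf, len);     /* fails with EAGAIN/EWOULDBLOCK rather than waiting */
    read_errno = errno;

    fcntl(fd, F_SETFL, saved);                     /* set the mode back */
    errno = read_errno;
    return n;
}

/* Self-check on a pipe. Returns 0 if behaviour is as described. */
int demo_read_once_noblock(void)
{
    int p[2];
    char buf[4];

    if (pipe(p) != 0)
        return 1;
    if (read_once_noblock(p[0], buf, sizeof buf) != -1)   /* empty pipe: no data */
        return 2;
    if (errno != EAGAIN && errno != EWOULDBLOCK)
        return 3;
    if (fcntl(p[0], F_GETFL, 0) & O_NONBLOCK)             /* blocking mode restored */
        return 4;
    if (write(p[1], "hi", 2) != 2)
        return 5;
    if (read_once_noblock(p[0], buf, sizeof buf) != 2)
        return 6;
    close(p[0]);
    close(p[1]);
    return 0;
}
```

One caveat with this approach: O_NONBLOCK belongs to the open file description, so the temporary change is also visible to any other thread or process sharing the descriptor while the read is in progress.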
Re: renegotiating problem - connection hanging?
On Sun, Jun 11, 2006 at 01:09:08PM +0200, Marek Marcola wrote:

> OK, but SSL_read() knows that it is doing renegotiation, and sometimes SSL_read() returns information that the user should call it again. A properly written application should be prepared (today) to handle this. If not, it may fail now too. So returning WANT_READ if renegotiation takes place changes nothing but solves this problem. I stand by my analysis of his problem because from my point of view the library is for the user, not the user for the library :-)

If the underlying BIO is a blocking BIO, SSL_read() must block until it returns some data. Applications that are not prepared to retry must not be asked to retry. I'm sorry, I just don't see the bug here. It is possible to arrange for SSL_read() to never block, by configuring the underlying BIOs correctly.

--
Viktor.
RE: renegotiating problem - connection hanging?
Hello,

>>> SSL_read has no way of knowing whether he really wanted to block until application-level data was available or whether he really didn't want to block -- and short of modifying it to call 'select' before it calls 'read' each time or to save the blocking state of the socket, set it to non-blocking, and then set it back, no conceivable implementation of SSL_read can guarantee that it won't block when called on a blocking socket.

>> OK, but SSL_read() knows that it is doing renegotiation, and sometimes SSL_read() returns information that the user should call it again. A properly written application should be prepared (today) to handle this. If not, it may fail now too. So returning WANT_READ if renegotiation takes place changes nothing but solves this problem.

> Properly-written applications don't make blocking socket operations when they don't want to block.

A properly written application has the right to use a blocking socket in a consistent way: it wants to block only on user data, it is not interested in what is going on in the SSL layer, and the library should assure that. Renegotiation is a special case and (now too) has special treatment in OpenSSL, so if you do not agree with that you may fix this in the next release.

>> I stand by my analysis of his problem because from my point of view the library is for the user, not the user for the library :-)

> It is this simple -- making a call on a blocking socket when you are really thinking in your head "I must not ever block" can *never* be made to work reliably.

I repeat: the application wants to block, but only on user data; you are not talking about this case.

> If you do not wish to block, you *must* set the socket non-blocking. That is *the* mechanism that assures you will not block and it is the *only* mechanism that does so.

Once again: I want to block, but I am not interested in low-level SSL stuff.

> If SSL_read is supposed to be usable on a blocking socket in any context in which it is supposed to be guaranteed not to block, it *must* set the socket non-blocking itself. There simply is no other way to avoid blocking. Fixing this one case won't fix all the other cases. If you must not block, you must set your sockets non-blocking. I'm sorry, but that's a fact.

I repeat again, the application wants to block, but only on user data. All SSL-related work should be handled internally by the library. If the SSL work needs some time to do something, OK, no problem. If it even wants to block for some SSL work, OK; after that (like now) it gives only a hint to call it again. Fixing this removes this dependency.

Best regards,
--
Marek Marcola
RE: renegotiating problem - connection hanging?
>> Properly-written applications don't make blocking socket operations when they don't want to block.

> A properly written application has the right to use a blocking socket in a consistent way: it wants to block only on user data, it is not interested in what is going on in the SSL layer, and the library should assure that. Renegotiation is a special case and (now too) has special treatment in OpenSSL, so if you do not agree with that you may fix this in the next release.

If SSL_read ever blocked when there was application-level data available, you would have a legitimate complaint.

>>> I stand by my analysis of his problem because from my point of view the library is for the user, not the user for the library :-)

>> It is this simple -- making a call on a blocking socket when you are really thinking in your head "I must not ever block" can *never* be made to work reliably.

> I repeat: the application wants to block, but only on user data; you are not talking about this case.

That's fine, SSL_read will never block when there is user data.

>> If you do not wish to block, you *must* set the socket non-blocking. That is *the* mechanism that assures you will not block and it is the *only* mechanism that does so.

> Once again: I want to block, but I am not interested in low-level SSL stuff.

That's fine, you are blocking. So you should be happy.

>> you must set your sockets non-blocking. I'm sorry, but that's a fact.

> I repeat again, the application wants to block, but only on user data.

That's fine. I agree that SSL_read on a blocking socket should block until user data is available.

> All SSL-related work should be handled internally by the library. If the SSL work needs some time to do something, OK, no problem. If it even wants to block for some SSL work, OK; after that (like now) it gives only a hint to call it again. Fixing this removes this dependency.

I can't follow your logic here. SSL_read on a blocking socket should work like read on a blocking socket, blocking until some data can be returned to the caller. Cases where it returns early (such as on receipt of a signal) are fine, but the caller should expect it to block until data is available and should not call it unless they know from application-level considerations that data will be available.

Applications that call a socket function or SSL function thinking "I do not want to block" *must* set the socket non-blocking. Nothing OpenSSL can do will change that.

DS
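At the plain-descriptor level, "set the socket non-blocking" plus select() can be sketched as below. This is a minimal illustration assuming POSIX and made-up helper names (set_nonblocking, read_when_ready); the -2 return code is an invention of this sketch. The point of interest is the branch that treats a read hit as a hint and simply goes back to waiting when read() reports EAGAIN.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Switches fd to non-blocking mode permanently. Returns 0 on success. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Waits up to timeout_ms for fd to become readable, then tries one read().
 * Returns bytes read (0 means EOF), -1 on error, or -2 when nothing could be
 * read yet (timeout, or the read hit was no longer true by the time of read). */
ssize_t read_when_ready(int fd, void *buf, size_t len, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int ready;
    ssize_t n;

    assert(fd >= 0 && fd < FD_SETSIZE);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    ready = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (ready < 0)
        return -1;
    if (ready == 0)
        return -2;                           /* timed out */

    n = read(fd, buf, len);
    if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
        return -2;                           /* hit was only a hint; select again */
    return n;
}

/* Self-check on a local socket pair. Returns 0 if behaviour is as described. */
int demo_read_when_ready(void)
{
    int sv[2];
    char buf[8];

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return 1;
    if (set_nonblocking(sv[0]) != 0)
        return 2;
    if (read_when_ready(sv[0], buf, sizeof buf, 20) != -2)   /* no data: returns */
        return 3;
    if (write(sv[1], "ping", 4) != 4)
        return 4;
    if (read_when_ready(sv[0], buf, sizeof buf, 20) != 4)
        return 5;
    close(sv[0]);
    close(sv[1]);
    return 0;
}
```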
Re: renegotiating problem - connection hanging?
On Sun, Jun 11, 2006 at 09:51:25PM +0200, Marek Marcola wrote:

> I repeat: the application wants to block, but only on user data; you are not talking about this case.

It looks like you are repeatedly making the same mistake. If one is willing to block for user data, by making a blocking SSL_read() call, one is consequently willing to also block on handshake packets that precede said user data. By making a blocking call one is saying that the application demands remote input at that point in the protocol and is willing to wait for it indefinitely.

A "socket ready for read" condition is not an appropriate indication that it is the peer's turn to send user data. Turn-taking is up to the higher level protocol. If the protocol is not half-duplex, the application must use non-blocking I/O. One must not make blocking SSL_read() calls unless one means it.

> I repeat again, the application wants to block, but only on user data. All SSL-related work should be handled internally by the library. If the SSL work needs some time to do something, OK, no problem. If it even wants to block for some SSL work, OK; after that (like now) it gives only a hint to call it again. Fixing this removes this dependency.

Sorry, I believe you are mistaken: you are proposing to fix poorly written non-blocking applications by breaking correctly written blocking applications. This is not acceptable. Please take some time to think through all the use cases beyond the immediate issue that is motivating this thread.

--
Viktor.
RE: renegotiating problem - connection hanging?
Hello,

> I can't follow your logic here. SSL_read on a blocking socket should work like read on a blocking socket, blocking until some data can be returned to the caller. Cases where it returns early (such as on receipt of a signal) are fine, but the caller should expect it to block until data is available and should not call it unless they know from application-level considerations that data will be available.

Look at the SSL dump sent by the author of this thread. Then you will understand what I mean.

Best regards,
--
Marek Marcola
Re: renegotiating problem - connection hanging?
Hello,

> Sorry, I believe you are mistaken: you are proposing to fix poorly written non-blocking applications by breaking correctly written blocking applications. This is not acceptable.

When you use a blocking socket now, you must react to SSL_ERROR_WANT* and many more; if not, you are making a mistake. A well-written application must react to these errors, sooner or later in the development process. My proposition was to return one of these errors in the situation where SSL_read() is doing renegotiation. Look at the SSL dump.

In what way does this break applications already in use ??? In what way does this break anything ???

Best regards,
--
Marek Marcola
RE: renegotiating problem - connection hanging?
When you use blocking socket now, you must react on SSL_ERROR_WANT* and many more - if not - you are doing mistake. If that's true, that's a defect in the implementation. People who use blocking sockets should get blocking behavior. Good written application must react on this errors - sooner or later in development process. Well-written applications should tolerate all kinds of failure scenarios, even the ones that aren't supposed to happen. My proposition was to add to one of this error situation when SSL_read() is doing renegotiation. Look at SSL dump. Your proposition was to add further breakage. It is a mistake to issue a blocking socket operation if you do not wish to block, end of story. This is just a single example of one way this can break and it is impossible to fix it completely without breaking proper blocking applications that really do want to block. In what way this break already used applications ??? In what way this break anything ??? It would have broken the application that this was about. If not for this problem, the author of this application would have released it with a serious defect -- namely that it attempts to perform operations with blocking sockets in a case where it cannot afford to block. Fortunately for the original poster, he was able to detect this problem and can now easily fix it by using non-blocking sockets. (And with luck he won't make this same mistake with TCP, UDP, or any other protocol. It bites more people more often than you might think. It's a common issue on USENET.) When code is broken and easy to fix, anything that makes the breakage harder to detect is bad. Anything that makes it trigger more often and be more easily detected is good. All other things being equal. DS
Re: renegotiating problem - connection hanging?
On Mon, Jun 12, 2006 at 12:06:28AM +0200, Marek Marcola wrote: In what way this break already used applications ??? In what way this break anything ??? SSL_read(3): If the underlying BIO is blocking, SSL_read() will only return, once the read operation has been finished or an error occurred, except when a renegotiation take place, in which case a SSL_ERROR_WANT_READ may occur. This behaviour can be controlled with the SSL_MODE_AUTO_RETRY flag of the SSL_CTX_set_mode(3) call. SSL_CTX_set_mode(3): SSL_MODE_AUTO_RETRY Never bother the application with retries if the transport is blocking. If a renegotiation take place during normal operation, a SSL_read(3) or SSL_write(3) would return with -1 and indicate the need to retry with SSL_ERROR_WANT_READ. In a non-blocking environment applications must be prepared to handle incomplete read/write operations. In a blocking environment, applications are not always prepared to deal with read/write operations returning without success report. The flag SSL_MODE_AUTO_RETRY will cause read/write operations to only return after the handshake and successful completion. If a blocking application sets SSL_MODE_AUTO_RETRY, SSL_read() will only return once data is available, or a real error occurs. This must not change. -- Viktor.
RE: renegotiating problem - connection hanging?
In what way this break already used applications ??? In what way this break anything ??? [snip] block. Fortunately for the original poster, he was able to detect this problem and can now easily fix it by using non-blocking sockets. (And with luck he won't make this same mistake with TCP, UDP, or any other protocol. It bites more people more often than you might think. It's a common issue on USENET.) This is getting scary. The original poster (that would be me) ran into the problem in s_client, part of OpenSSL.
RE: renegotiating problem - connection hanging?
In what way this break already used applications ??? In what way this break anything ??? [snip] block. Fortunately for the original poster, he was able to detect this problem and can now easily fix it by using non-blocking sockets. (And with luck he won't make this same mistake with TCP, UDP, or any other protocol. It bites more people more often than you might think. It's a common issue on USENET.) This is getting scary. The original poster (that would be me) ran into the problem in s_client, part of OpenSSL. I guess it comes down to whether the intention was to block until application-level data could be received. If the OpenSSL authors don't know that you *must* set a socket non-blocking to ensure that socket operations don't block, that would be sad. In any event, if there is a bug in s_client, good that we caught it. A lot of people use that code as an example and this might lead them to think that you can follow 'select' with blocking socket operations and still somehow be assured of not blocking. You cannot, and any code based on that assumption is broken. It's hard for me to imagine s_client could be broken that badly. I'll take a look at it. DS
RE: renegotiating problem - connection hanging?
I always call SSL_pending() before going into select(), as far as I understand that should be sufficient. Anyways, the server is not hanging in select(), it is definitely inside SSL_read(). Is your socket non-blocking? No, socket is blocking. When I run s_client in non-blocking mode it doesn't get stuck. You can't use 'select' reliably with blocking sockets. Well, it is possible to do so, but it is extremely difficult and can only be done with OpenSSL using bio pairs or some other mechanism where you do all the I/O. Why are your sockets blocking? And if you want to block, what is 'select' doing for you? DS
RE: renegotiating problem - connection hanging?
Is your socket non-blocking? No, socket is blocking. When I run s_client in non-blocking mode it doesn't get stuck. You can't use 'select' reliably with blocking sockets. Well, it is possible to do so, but it is extremely difficult and can only be done with OpenSSL using bio pairs or some other mechanism where you do all the I/O. Why are your sockets blocking? And if you want to block, what is 'select' doing for you? Well, we are talking about s_client here... part of openssl executable. select() is used with the blocking sockets to make sure that, well, they don't block. If you call SSL_read on a blocking socket when select says it is readable you expect it not to block [forever]. Of course it might block if there is some data available on the underlying socket but not enough to complete SSL deciphering, but under normal circumstances it will only block until the rest of the record is received. Am I missing something?
RE: renegotiating problem - connection hanging?
Well, we are talking about s_client here... part of openssl executable. select() is used with the blocking sockets to make sure that, well, they don't block. It doesn't work that way. The only way to ensure that socket operations don't block is to set the sockets non-blocking. If you call SSL_read on a blocking socket when select says it is readable you expect it not to block [forever]. Of course it might block if there is some data available on the underlying socket but not enough to complete SSL deciphering, but under normal circumstances it will only block until the rest of the record is received. Am I missing something? Here's a hypothetical. The 'select' function gives you a 'read' hit. You call SSL_read (thinking there's application-level data, but you don't really know, do you?). SSL_read reads part of a re-negotiation but has no data to return to you, so it calls 'read' again (how does it know it's not supposed to block until it has data?). That 'read' blocks forever because there was never any application-level data to read. Sorry, you're screwed. You are blocked in 'read' but the other side is waiting for you to send protocol-level data. DS
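For readers following along, here is a minimal sketch of what "set the sockets non-blocking" means at the POSIX level. It does not use OpenSSL at all; the helper names set_nonblocking and empty_read_would_block are made up for illustration, and a local socketpair stands in for a real connection. The point it tries to show is that, once O_NONBLOCK is set, a read with nothing to return fails with EAGAIN/EWOULDBLOCK instead of parking the process.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: put a descriptor into non-blocking mode.
 * Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical demo: returns 1 if a read on an empty non-blocking socket
 * fails immediately with EAGAIN/EWOULDBLOCK, 0 if it behaved otherwise,
 * and -1 if the setup itself failed. */
int empty_read_would_block(void)
{
    int sv[2];
    char buf[16];
    ssize_t n;
    int result;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    if (set_nonblocking(sv[0]) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* Nothing was ever written to sv[1], so a blocking read would hang here. */
    n = read(sv[0], buf, sizeof buf);
    result = (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

With a blocking descriptor the same read would simply never return, which is the behaviour being argued about in this thread.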
Re: renegotiating problem - connection hanging?
I'm watching this thread with great interest as I have not figured out the correct way to handle OpenSSL with non-blocking sockets, which are a requirement in my case. Can anyone expand on the correct way to handle OpenSSL over non-blocking sockets please? I haven't been able to find any reliable literature on it yet, even the O'Reilly book is very sketchy on this. Joe David Schwartz wrote: Well, we are talking about s_client here... part of openssl executable. select() is used with the blocking sockets to make sure that, well, they don't block. It doesn't work that way. The only way to ensure that socket operations don't block is to set the sockets non-blocking. If you call SSL_read on a blocking socket when select says it is readable you expect it not to block [forever]. Of course it might block if there is some data available on the underlying socket but not enough to complete SSL deciphering, but under normal circumstances it will only block until the rest of the record is received. Am I missing something? Here's a hypothetical. The 'select' function gives you a 'read' hit. You call SSL_read (thinking there's application-level data, but you don't really know, do you?). SSL_read reads part of a re-negotiation but has no data to return to you, so it calls 'read' again (how does it know it's not supposed to block until it has data?). That 'read' blocks forever because there was never any application-level data to read. Sorry, you're screwed. You are blocked in 'read' but the other side is waiting for you to send protocol-level data. DS
RE: renegotiating problem - connection hanging?
The discussion below wherein the term you're screwed is used seems to indicate that there is a deadlock situation, which isn't the case. There may or may not be performance issues associated with the scenario/use-case, but there's no deadlock. R -Original Message- From: [EMAIL PROTECTED] on behalf of David Schwartz Sent: Sat 6/10/2006 1:02 PM To: openssl-users@openssl.org Subject: RE: renegotiating problem - connection hanging? Well, we are talking about s_client here... part of openssl executable. select() is used with the blocking sockets to make sure that, well, they don't block. It doesn't work that way. The only way to ensure that socket operations don't block is to set the sockets non-blocking. If you call SSL_read on a blocking socket when select says it is readable you expect it not to block [forever]. Of course it might block if there is some data available on the underlying socket but not enough to complete SSL deciphering, but under normal circumstances it will only block until the rest of the record is received. Am I missing something? Here's a hypothetical. The 'select' function gives you a 'read' hit. You call SSL_read (thinking there's application-level data, but you don't really know, do you?). SSL_read reads part of a re-negotiation but has no data to return to you, so it calls 'read' again (how does it know it's not supposed to block until it has data?). That 'read' blocks forever because there was never any application-level data to read. Sorry, you're screwed. You are blocked in 'read' but the other side is waiting for you to send protocol-level data. DS
RE: renegotiating problem - connection hanging?
The discussion below wherein the term you're screwed is used seems to indicate that there is a deadlock situation, which isn't the case. There may or may not be performance issues associated with the scenario/use-case, but there's no deadlock. Did you look at my logs with s_client? I'm starting to suspect that the correct way to put it is: there is *supposed* to be no deadlock, but there is a bug in SSL_read that can make you screwed. R -Original Message- From: [EMAIL PROTECTED] on behalf of David Schwartz Sent: Sat 6/10/2006 1:02 PM To: openssl-users@openssl.org Subject: RE: renegotiating problem - connection hanging? Well, we are talking about s_client here... part of openssl executable. select() is used with the blocking sockets to make sure that, well, they don't block. It doesn't work that way. The only way to ensure that socket operations don't block is to set the sockets non-blocking. If you call SSL_read on a blocking socket when select says it is readable you expect it not to block [forever]. Of course it might block if there is some data available on the underlying socket but not enough to complete SSL deciphering, but under normal circumstances it will only block until the rest of the record is received. Am I missing something? Here's a hypothetical. The 'select' function gives you a 'read' hit. You call SSL_read (thinking there's application-level data, but you don't really know, do you?). SSL_read reads part of a re-negotiation but has no data to return to you, so it calls 'read' again (how does it know it's not supposed to block until it has data?). That 'read' blocks forever because there was never any application-level data to read. Sorry, you're screwed. You are blocked in 'read' but the other side is waiting for you to send protocol-level data. DS
RE: renegotiating problem - connection hanging?
If you call SSL_read on a blocking socket when select says it is readable you expect it not to block [forever]. Of course it might block if there is some data available on the underlying socket but not enough to complete SSL deciphering, but under normal circumstances it will only block until the rest of the record is received. Am I missing something? Here's a hypothetical. The 'select' function gives you a 'read' hit. You call SSL_read (thinking there's application-level data, but you don't really know, do you?). SSL_read reads part of a re-negotiation but has no data to return to you, so it calls 'read' again (how does it know it's not supposed to block until it has data?). That 'read' blocks forever because there was never any application-level data to read. Sorry, you're screwed. You are blocked in 'read' but the other side is waiting for you to send protocol-level data. I'd agree with you if it was not working consistently. But in most cases blocking SSL_read returns helpful WANT_READ. My understanding is that WANT_READ return from SSL_read is especially for avoiding the deadlock I'm running into. Why else would a blocking SSL_read return WANT_READ?? I'm asking it to read data and it comes back and says I can't help you, I'd have to wait for data first. I think what it is trying to say is I can't help you right now because I don't really have data to read even though you probably had reasons to believe that there was some data available. Either way, there seems to be a bug either in the lib itself or in s_client.
RE: renegotiating problem - connection hanging?
I'd agree with you if it was not working consistently. It's a race condition. But in most cases blocking SSL_read returns helpful WANT_READ. My understanding is that WANT_READ return from SSL_read is especially for avoiding the deadlock I'm running into. You would be correct if your socket was non-blocking. But because it's blocking, when SSL_read calls 'read', it blocks, possibly forever. The problem is that if you call SSL_read with a blocking socket, it will block until it has some application-level data for you. But you called it when there was no application-level data. Why else would a blocking SSL_read return WANT_READ?? I'm not even sure if this is a blocking SSL_read or not. You called 'select'. You have some horrible hybrid between blocking and non-blocking. I'm asking it to read data and it comes back and says I can't help you, I'd have to wait for data first. I think what it is trying to say is I can't help you right now because I don't really have data to read even though you probably had reasons to believe that there was some data available. Either way, there seems to be a bug either in the lib itself or in s_client. How can it know what you want it to do when you lie to it? You called a *blocking* SSL_read, whose sole job is to block until it gets some data for you. But there is no data for you. So it blocks, like you asked it to do when you called it with a blocking socket. DS
RE: renegotiating problem - connection hanging?
The discussion below wherein the term you're screwed is used seems to indicate that there is a deadlock situation, which isn't the case. There may or may not be performance issues associated with the scenario/use-case, but there's no deadlock. R There is a deadlock. You are blocked in 'read' even though there is no data to read. Again, here's the scenario:
1) 'select' indicates that there's data to read, but it's protocol-level data due to a renegotiation, not application-level data.
2) Thinking there's application-level data, you call 'SSL_read'.
3) SSL_read calls 'read', gets the protocol-level data. It calls 'read' again to try to get application-level data.
If this occurs on the side that is expected to send application-level data next, you deadlock. DS
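The three steps above can be reproduced without OpenSSL using a toy layered protocol. The following is only an illustrative sketch: the record format (one type byte, one length byte, then the payload), the constants, and the names send_record, layered_read and select_hit_without_app_data are all invented here and are not how OpenSSL is implemented. The socket is made non-blocking, so the second read in step 3 reports "want read" instead of hanging; with a blocking socket that same read is where the process would get stuck.

```c
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

enum { REC_HANDSHAKE = 22, REC_APPDATA = 23 };    /* toy record types */
enum { LAYER_WANT_READ = -2, LAYER_ERROR = -1 };  /* toy return codes */

/* Send one toy record: [type][len][payload...] in a single write. */
static int send_record(int fd, unsigned char type, const char *data, unsigned char len)
{
    unsigned char rec[2 + 255];
    rec[0] = type;
    rec[1] = len;
    memcpy(rec + 2, data, len);
    return write(fd, rec, 2u + len) == (ssize_t)(2u + len) ? 0 : -1;
}

/* Application-level read over the toy protocol: protocol-level records are
 * consumed silently, and only application data is returned to the caller. */
static int layered_read(int fd, char *out, size_t cap)
{
    for (;;) {
        unsigned char hdr[2], payload[255];
        ssize_t n = read(fd, hdr, 2);
        if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
            return LAYER_WANT_READ;   /* a blocking fd would hang in read() here */
        if (n != 2)
            return LAYER_ERROR;
        n = read(fd, payload, hdr[1]);
        if (n != hdr[1])
            return LAYER_ERROR;
        if (hdr[0] == REC_APPDATA) {
            size_t len = hdr[1] < cap ? hdr[1] : cap;
            memcpy(out, payload, len);
            return (int)len;
        }
        /* Handshake-like record: nothing for the caller, so read again (step 3). */
    }
}

/* Returns 0 if select() reported a read hit even though no application data
 * was available, and the later application record was then delivered. */
int select_hit_without_app_data(void)
{
    int sv[2], hit, first, second, flags;
    char out[16];
    fd_set rfds;
    struct timeval tv = { 1, 0 };

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags == -1 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;

    send_record(sv[1], REC_HANDSHAKE, "hello", 5);   /* step 1: protocol data only */
    FD_ZERO(&rfds);
    FD_SET(sv[0], &rfds);
    hit = select(sv[0] + 1, &rfds, NULL, NULL, &tv);

    first = layered_read(sv[0], out, sizeof out);    /* steps 2 and 3 */

    send_record(sv[1], REC_APPDATA, "ping", 4);      /* peer finally sends data */
    second = layered_read(sv[0], out, sizeof out);

    close(sv[0]);
    close(sv[1]);
    return (hit == 1 && first == LAYER_WANT_READ && second == 4) ? 0 : -1;
}
```

The select hit was real, but it described the transport, not the application layer, which is exactly the mismatch in the scenario.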
RE: renegotiating problem - connection hanging?
Did you look at my logs with s_client? I'm starting to suspect that the correct way to put it is: there is *supposed* to be no deadlock, but there is a bug in SSL_read that can make you screwed. The bug is not in SSL_read. The bug is in the decision to call SSL_read. There is one important difference between SSL and TCP. A 'select' hit on TCP guarantees (almost) that there is application-level data to read, so you can generally get away with following a 'select' hit for read with a blocking read. However, a 'select' hit on an SSL connection does not in any way guarantee that there is application-level data to read, so it is an error to call a blocking read function based solely on a 'select' hit. For SSL, you can only call a blocking SSL_read if you know from the application-level protocol that it is safe to block until application-level data is received. (Just as would be the case with calling 'read' on a TCP connection *without* a select.) The error is in assuming that a read hit from 'select' guarantees that there is application-level data and therefore that it's safe to call an application-level blocking read function. A read hit from 'select' is *NOT* justification for calling a blocking SSL_read function. Only knowledge that it is safe to block until application-level data is received will do. DS
RE: renegotiating problem - connection hanging?
Hello, Here's a hypothetical. The 'select' function gives you a 'read' hit. You call SSL_read (thinking there's application-level data, but you don't really know, do you?). SSL_read reads part of a re-negotiation but has no data to return to you, so it calls 'read' again (how does it know it's not supposed to block until it has data?). It is very simple - if SSL_read() has to do other work than reading application data records (encrypted user data), like renegotiation, it should return WANT_READ. Then the upper layer may retry SSL_read() after select(). For me this is an SSL_read() problem and may be simply corrected. That 'read' blocks forever because there was never any application-level data to read. Sorry, you're screwed. I do not agree. SSL_read() should be corrected. Best regards, -- Marek Marcola [EMAIL PROTECTED]
RE: renegotiating problem - connection hanging?
It is very simple - if SSL_read() has to do other work than reading application data records (encrypted user data) like renegotiation it should return WANT_READ. An SSL_read on a blocking socket should block until data can be read, just as a regular 'read' on a TCP connection does. Then upper layer may retry SSL_read() after select(). For me this is SSL_read() problem and may be simply corrected. That makes no sense. Why should the upper layer retry when it already asked for a blocking read? That 'read' blocks forever because there was never any application-level data to read. Sorry, you're screwed. I do not agree. SSL_read() should be corrected. If you call SSL_read, an application-level read function, with a blocking socket, you are asking it to block until it can read application-level data. The error is simple -- for an SSL connection, a read hit from 'select' does not guarantee that application-level data has been received, so it is an error to call an application-level blocking read function just because of this. The change you would suggest would break applications that use a blocking SSL_read correctly, as a way to block until application-level data has been read. DS
RE: renegotiating problem - connection hanging?
Hello, It is very simple - if SSL_read() has to do other work than reading application data records (encrypted user data) like renegotiation it should return WANT_READ. An SSL_read on a blocking socket should block until data can be read, just as a regular 'read' on a TCP connection does. Then upper layer may retry SSL_read() after select(). For me this is SSL_read() problem and may be simply corrected. That makes no sense. Why should the upper layer retry when it already asked for a blocking read? This makes sense and is used in SSL_read() now. When you have a blocking socket, SSL_read() may return with an indication that it should be called again. This is clearly documented on www.openssl.org and in the source code. So this mechanism is used now, and it is not true that if you call SSL_read() on a blocking socket it must return data or a critical error. That 'read' blocks forever because there was never any application-level data to read. Sorry, you're screwed. I do not agree. SSL_read() should be corrected. If you call SSL_read, an application-level read function, with a blocking socket, you are asking it to block until it can read application-level data. You are asking it - but (even now) you may get something else, and this is no error. Best regards, -- Marek Marcola [EMAIL PROTECTED]
RE: renegotiating problem - connection hanging?
Hello, If you call SSL_read, an application-level read function, with a blocking socket, you are asking it to block until it can read application-level data. Here is information from www.openssl.org: -- If the underlying BIO is blocking, SSL_read() will only return, once -- the read operation has been finished or an error occurred, except -- when a renegotiation take place, in which case a SSL_ERROR_WANT_READ -- may occur. So as you see this is implemented but not exactly :-) Best regards, -- Marek Marcola [EMAIL PROTECTED]
Re: renegotiating problem - connection hanging?
On Sat, Jun 10, 2006 at 03:54:18PM -0700, David Schwartz wrote: I do not agree. SSL_read() should be corrected. If you call SSL_read, an application-level read function, with a blocking socket, you are asking it to block until it can read application-level data. The error is simple -- for an SSL connection, a read hit from 'select' does not guarantee that application-level data has been received, so it is an error to call an application-level blocking read function just because of this. The change you would suggest would break applications that use a blocking SSL_read correctly, as a way to block until application-level data has been read. Is nobody in this thread familiar with the BIO interface? bio(3) BIO_new_bio_pair(3) ... One typical use of BIO pairs is to place TLS/SSL I/O under application control, this can be used when the application wishes to use a non standard transport for TLS/SSL or the normal socket routines are inappropriate. Calls to BIO_read() will read data from the buffer or request a retry if no data is available. Calls to BIO_write() will place data in the buffer or request a retry if the buffer is full. The standard calls BIO_ctrl_pending() and BIO_ctrl_wpending() can be used to determine the amount of pending data in the read or write buffer. ... Both halves of a BIO pair should be freed. That is even if one half is implicit freed due to a BIO_free_all() or SSL_free() call the other half needs to be freed. When used in bidirectional applications (such as TLS/SSL) care should be taken to flush any data in the write buffer. This can be done by calling BIO_pending() on the other half of the pair and, if any data is pending, reading it and sending it to the underlying transport. This must be done before any normal processing (such as calling select() ) due to a request and BIO_should_read() being true. ... SSL_set_bio(3) SSL_read(3) If you create a bio_pair, and instruct SSL via SSL_set_bio to use the bio pair, SSL_read never directly reads the network; if it wants more data and the read BIO is empty it tells you, and if its write BIO is full, it likewise tells you. It is up to the application to fill and drain the BIOs from/to the socket. -- Viktor.
RE: renegotiating problem - connection hanging?
Hello, An SSL_read on a blocking socket should block until data can be read, just as a regular 'read' on a TCP connection does. Even in a regular read() from a blocking socket there may be situations where -1 is returned but no critical error has occurred and you should simply retry read() - when EINTR occurred - this sometimes happens in a debugger. Best regards, -- Marek Marcola [EMAIL PROTECTED]
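The EINTR case mentioned here is usually handled with a small retry loop. A minimal sketch follows; the names read_retry_eintr and pipe_roundtrip are made up for the example, and a pipe stands in for the socket.

```c
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: like read(), but restarts the call when it was
 * interrupted by a signal before any data was transferred. */
ssize_t read_retry_eintr(int fd, void *buf, size_t len)
{
    ssize_t n;
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    return n;
}

/* Hypothetical demo: push two bytes through a pipe and read them back.
 * Returns the number of bytes read, or -1 on setup failure. */
int pipe_roundtrip(void)
{
    int p[2];
    char buf[8];
    ssize_t n;

    if (pipe(p) != 0)
        return -1;
    if (write(p[1], "hi", 2) != 2) {
        close(p[0]);
        close(p[1]);
        return -1;
    }
    n = read_retry_eintr(p[0], buf, sizeof buf);
    close(p[0]);
    close(p[1]);
    return (int)n;
}
```

Note that this retry is about an interrupted system call, not about a read that has nothing to return; it does not turn a blocking read into a non-blocking one.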
RE: renegotiating problem - connection hanging?
Hello, If you call SSL_read, an application-level read function, with a blocking socket, you are asking it to block until it can read application-level data. Here is information from www.openssl.org: -- If the underlying BIO is blocking, SSL_read() will only return, once -- the read operation has been finished or an error occurred, except -- when a renegotiation take place, in which case a SSL_ERROR_WANT_READ -- may occur. So as you see this is implemented but not exactly :-) There are a huge number of corner cases I did not address, and it was not my intent to give a 100% complete discussion of the use of SSL_read. Nevertheless, I stand by my analysis of his problem. He called SSL_read with a blocking socket even though he did not want to block. SSL_read has no way of knowing that he doesn't want to block because he lied to it by invoking it as a blocking operation. There will never be any perfectly satisfactory solution to this problem. SSL_read has no way of knowing whether he really wanted to block until application-level data was available or whether he really didn't want to block -- and short of modifying it to call 'select' before it calls 'read' each time, or to save the blocking state of the socket, set it to non-blocking, and then set it back, no conceivable implementation of SSL_read can guarantee that it won't block when called on a blocking socket. The short answer is, if you do not want your socket operations to block, you *MUST* set the sockets non-blocking. Otherwise, they can block in a large variety of corner cases causing your program to fail in unpredictable and sometimes hard to reproduce ways. This problem is not unique to OpenSSL or even to protocols layered on top of TCP. It is an error to call an application-level blocking read operation if you do not want to block until application-level data is available. It is an error to assume that application-level data must be available just because 'select' gives you a read hit. You can only be certain that application-level data will be available if the application-level protocol ensures it. (For example, if it's SMTP over SSL and you just sent a complete command that the SMTP protocol specifies requires a reply.) This is one of a whole host of similar bugs caused by thinking that blocking sockets can be used in a non-blocking way by using status functions. There are a variety of reasons this doesn't work. One of them is that things can change after the status function returns but before you complete the socket operation. Another (the one that bites here) is that the status function may not test precisely the same thing the blocking function waits for. Unless you are guaranteed that the status function tests precisely the same thing the blocking function waits for and you are further guaranteed nothing can change in-between when the status function returns and the blocking operation commences, you are *not* guaranteed that a blocking operation will not block and it is an *error* to assume so. DS
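The first failure mode listed above (things changing between the status function and the socket operation) can be shown in a few lines. This is only a sketch under artificial assumptions: the function name hit_then_drained is invented, and the "something else" that consumes the data is simulated by an explicit recv() in the same thread, where in a real program it might be another thread, another process sharing the descriptor, or the kernel discarding a bad datagram. The socket is non-blocking so the demo can observe the outcome instead of hanging.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical demo: returns 1 if select() reported the socket readable and
 * the later read still found nothing, 0 otherwise, -1 on setup failure. */
int hit_then_drained(void)
{
    int sv[2], flags, hit, result;
    char c;
    ssize_t n;
    fd_set rfds;
    struct timeval tv = { 1, 0 };

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    flags = fcntl(sv[0], F_GETFL, 0);
    if (flags == -1 || fcntl(sv[0], F_SETFL, flags | O_NONBLOCK) == -1 ||
        write(sv[1], "x", 1) != 1) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    /* The status function says "readable"... */
    FD_ZERO(&rfds);
    FD_SET(sv[0], &rfds);
    hit = select(sv[0] + 1, &rfds, NULL, NULL, &tv);

    /* ...then the state changes before we act on that answer. */
    (void)recv(sv[0], &c, 1, 0);

    /* On a blocking socket this read would now wait indefinitely. */
    n = read(sv[0], &c, 1);
    result = (hit == 1 && n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK));

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

The answer from select() was correct at the moment it was given; it simply carries no guarantee about the moment the read is issued.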
Re: renegotiating problem - connection hanging?
Hello, Would appreciate any advice on how to proceed with debugging this. As usual my suggestion is to add the -msg -debug options to get more information from openssl s_client. On the server you may check the auto-retry option: SSL_CTX_set_mode(ctx, SSL_MODE_AUTO_RETRY); this may help if return codes from the SSL read/write functions are not handled correctly. Remember that data is buffered in the SSL layer, so sometimes when you use select() on the file descriptor you may wait for client data (that is already in the local SSL buffer) while the client waits for the server response - and the connection looks hung. Best regards, -- Marek Marcola [EMAIL PROTECTED]
Re: renegotiating problem - connection hanging?
> > Would appreciate any advice on how to proceed with debugging this.
>
> As usual my suggestion is to add -msg -debug options to get more information from openssl s_client.

I get a bunch of binary data displayed, but it seems to stop on the same line:

SSL_connect:SSLv3 read finished A

I'm now also running my own client and it doesn't seem to have the problem, so I'm starting to suspect (well, rather hope) that this might be an issue with s_client...

> On server you may check auto-retry option: SSL_CTX_set_mode(ctx, SSL_MODE_AUTO_RETRY); this may help if not correctly support return codes from SSL read/write functions.

I think I'm always handling WANT_* returns.

> Remember that data is buffered in SSL layer, so sometimes when you use select() on filedescriptor you may wait for client data (that is already in local SSL buffer) and client will wait for server response - and connection looks hang.

I always call SSL_pending() before going into select(); as far as I understand, that should be sufficient. Anyway, the server is not hanging in select(), it is definitely inside SSL_read().
Re: renegotiating problem - connection hanging?
Hello,

> I think I'm always handling WANT_* returns.

> I always call SSL_pending() before going into select(), as far as I understand that should be sufficient. Anyways, the server is not hanging in select(), it is definitely inside SSL_read().

Ok, just checking :-)

Best regards,
--
Marek Marcola [EMAIL PROTECTED]
Re: renegotiating problem - connection hanging?
> > I always call SSL_pending() before going into select(), as far as I understand that should be sufficient. Anyways, the server is not hanging in select(), it is definitely inside SSL_read().
>
> Ok, just checking :-)

I think there is a bug in the library... I've added some debug printouts to s_client and here is what I get:

calling SSL_write
after SSL_write: write 6 bytes, 0
select returned 1 fd, read: 1, write 0
calling SSL_read
SSL_connect:SSL renegotiate ciphers
SSL_connect:SSLv3 write client hello A
SSL_connect:error in SSLv3 read server hello A
after SSL_read: 9 bytes, 0
select returned 1 fd, read: 1, write 0
calling SSL_read
SSL_connect:SSLv3 read server hello A
...
verify error:num=20:unable to get local issuer certificate
verify return:1
...
verify error:num=27:certificate not trusted
verify return:1
...
verify error:num=21:unable to verify the first certificate
verify return:1
SSL_connect:SSLv3 read server certificate A
SSL_connect:SSLv3 read server key exchange A
SSL_connect:SSLv3 read server done A
SSL_connect:SSLv3 write client key exchange A
SSL_connect:SSLv3 write change cipher spec A
SSL_connect:SSLv3 write finished A
SSL_connect:SSLv3 flush data
SSL_connect:SSLv3 read finished A

and it is stuck.

So we call blocking SSL_read() based on select(), but select saw data that was part of the renegotiation process, so SSL_read() has nothing to return and it hangs. It should be returning WANT_READ, but I think it is getting confused because we have two calls to SSL_read and the second one does not realize that there is a renegotiation going on.

Contrast with the log of the working scenario (notice there is only one call of SSL_read() this time):

calling SSL_write
after SSL_write: 7 bytes, 0
select returned 1 fd, read: 1, write 0
calling SSL_read
SSL_connect:SSL renegotiate ciphers
SSL_connect:SSLv3 write client hello A
SSL_connect:SSLv3 read server hello A
...
verify error:num=20:unable to get local issuer c
verify return:1
...
verify error:num=27:certificate not trusted
verify return:1
...
verify error:num=21:unable to verify the first c
verify return:1
SSL_connect:SSLv3 read server certificate A
SSL_connect:SSLv3 read server key exchange A
SSL_connect:SSLv3 read server done A
SSL_connect:SSLv3 write client key exchange A
SSL_connect:SSLv3 write change cipher spec A
SSL_connect:SSLv3 write finished A
SSL_connect:SSLv3 flush data
SSL_connect:SSLv3 read finished A
after SSL_read: -1, 2
read R BLOCK

You see, this time our SSL_read() blocks until the renegotiation is complete and then returns WANT_READ. So I don't see a safe way of calling blocking SSL_read() knowing that it will for sure have something to return. Bug?
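The situation in the message above, where select() reports a readable descriptor but only handshake traffic has arrived, can be reproduced with a toy record layer. This is a sketch only and does not use OpenSSL: the framing (one type byte, one length byte, then the payload), struct toy_conn, toy_init, toy_read and TOY_WANT_READ are all invented for this illustration. The type values 0x16 and 0x17 are the SSL/TLS content types for handshake and application data. Because the descriptor is non-blocking, a call that finds only handshake records returns a "want read" code instead of hanging.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

enum { REC_HANDSHAKE = 0x16, REC_APPDATA = 0x17 };
enum { TOY_WANT_READ = -2 };

/* Toy connection: collects one record at a time from a non-blocking fd. */
struct toy_conn {
    int fd;
    unsigned char buf[2 + 255];  /* header (type, length) plus payload */
    size_t have;                 /* bytes of the current record so far */
};

/* Attach to fd and switch it to non-blocking mode. 0 on success. */
int toy_init(struct toy_conn *c, int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1)
        return -1;
    c->fd = fd;
    c->have = 0;
    return 0;
}

/* Return application bytes copied to out, 0 at EOF, -1 on error, or
 * TOY_WANT_READ when no complete application record is available yet.
 * Handshake records are consumed internally and never reach the caller.
 * Toy simplification: payload beyond len is dropped. */
long toy_read(struct toy_conn *c, unsigned char *out, size_t len)
{
    assert(c != NULL && out != NULL);
    for (;;) {
        size_t need = c->have < 2 ? 2 : 2 + (size_t)c->buf[1];
        while (c->have < need) {
            ssize_t n = read(c->fd, c->buf + c->have, need - c->have);
            if (n == -1 && errno == EINTR)
                continue;
            if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK))
                return TOY_WANT_READ;
            if (n <= 0)
                return (long)n;
            c->have += (size_t)n;
            need = c->have < 2 ? 2 : 2 + (size_t)c->buf[1];
        }
        unsigned char type = c->buf[0];
        size_t plen = c->buf[1];
        c->have = 0;                       /* record fully consumed */
        if (type != REC_APPDATA || plen == 0)
            continue;                      /* nothing for the caller */
        if (plen > len)
            plen = len;
        memcpy(out, c->buf + 2, plen);
        return (long)plen;
    }
}
```

The caller treats TOY_WANT_READ the way the thread treats WANT_READ: go back to select() and try again later, rather than assuming that a readable descriptor means application data.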
RE: renegotiating problem - connection hanging?
> I always call SSL_pending() before going into select(), as far as I understand that should be sufficient. Anyways, the server is not hanging in select(), it is definitely inside SSL_read().

Is your socket non-blocking?

DS
Re: renegotiating problem - connection hanging?
Hello,

> I think there is a bug in the library... I've added some debug printouts to s_client and here is what I get:
>
> calling SSL_write
> after SSL_write: write 6 bytes, 0
> select returned 1 fd, read: 1, write 0
> calling SSL_read
> SSL_connect:SSL renegotiate ciphers
> SSL_connect:SSLv3 write client hello A
> SSL_connect:error in SSLv3 read server hello A
> after SSL_read: 9 bytes, 0

It would be interesting to see what is sent from the server to the client at this stage... can you send the -msg -state -debug output up to this point?

Best regards,
--
Marek Marcola [EMAIL PROTECTED]
RE: renegotiating problem - connection hanging?
> > I always call SSL_pending() before going into select(), as far as I understand that should be sufficient. Anyways, the server is not hanging in select(), it is definitely inside SSL_read().
>
> Is your socket non-blocking?

No, the socket is blocking. When I run s_client in non-blocking mode it doesn't get stuck.
Re: renegotiating problem - connection hanging?
> > calling SSL_write
> > after SSL_write: write 6 bytes, 0
> > select returned 1 fd, read: 1, write 0
> > calling SSL_read
> > SSL_connect:SSL renegotiate ciphers
> > SSL_connect:SSLv3 write client hello A
> > SSL_connect:error in SSLv3 read server hello A
> > after SSL_read: 9 bytes, 0
>
> Interesting what is sent from server to client in this stage ... can you send -msg -state -debug output to this point ?

after SSL_write: 6, 0
select returned 1, read: 1, write 0
calling SSL_read because: [0, 1]
read from 0xa90df0 [0xa974b0] (5 bytes = 5 (0x5))
- 16 03 ..
0005 - SPACES/NULS
read from 0xa90df0 [0xa974b5] (32 bytes = 32 (0x20))
- 75 da fc 50 99 84 dd 44-86 95 b4 30 89 0a 75 6d u..P...D...0..um
0010 - 8c 08 03 f6 8f 10 95 25-4b 71 8c ab 5a 71 db 71 ...%Kq..Zq.q
SSL 3.0 Handshake [length 0004], HelloRequest
00 00 00 00
SSL_connect:SSL renegotiate ciphers
write to 0xa90df0 [0xa9bce8] (133 bytes = 133 (0x85))
- 16 03 00 00 80 bb d3 19-73 b0 3a 9b 04 7d c6 b9 s.:..}..
0010 - 17 d2 8e 0a 52 3c aa 76-fb ec 11 b9 cd 19 6c fd R.v..l.
0020 - 35 10 15 85 5b ab af bd-ec 82 15 d3 fb c9 90 27 5...[..'
0030 - 2e 73 4a 41 d9 4b 64 28-c4 ab f0 95 28 7c a9 bd .sJA.Kd((|..
0040 - 5d cb 23 5c 2f a2 6c a7-55 a1 52 e5 25 ae da 85 ].#\/.l.U.R.%...
0050 - 83 5d 67 73 ee 3d 9d 9a-61 0c bf 81 6c 02 62 74 .]gs.=..a...l.bt
0060 - de 31 a6 bb 63 d4 b0 e3-99 1c 77 c8 49 cb f1 5f .1..c.w.I.._
0070 - 40 f5 bc c6 96 59 1e 06-8e 65 59 0f 1a ab 5e f1 @Y...eY...^.
0080 - 85 33 d1 3b fe .3.;.
SSL 3.0 Handshake [length 0061], ClientHello
01 00 00 5d 03 00 44 8a 2f 89 c6 bd ab 01 7e 5a
4d 08 6e d3 93 d9 27 31 2e d9 18 61 5b 0c eb 0f
6f 32 00 89 9f 1b 00 00 36 00 39 00 38 00 35 00
16 00 13 00 0a 00 33 00 32 00 2f 00 07 00 66 00
05 00 04 00 63 00 62 00 61 00 15 00 12 00 09 00
65 00 64 00 60 00 14 00 11 00 08 00 06 00 03 01
00
SSL_connect:SSLv3 write client hello A
read from 0xa90df0 [0xa974b0] (5 bytes = 5 (0x5))
- 17 03 ..
0005 - SPACES/NULS
read from 0xa90df0 [0xa974b5] (32 bytes = 32 (0x20))
- 4c 0e be 0c d4 b3 4c 1f-fa 4f 08 5f 76 86 c4 24 L.L..O._v..$
0010 - ba 03 82 64 8c f2 6b 28-28 fd 27 c0 f3 b7 c2 66 ...d..k((.'f
read from 0xa90df0 [0xa974b0] (5 bytes = 5 (0x5))
- 17 03 ..
0005 - SPACES/NULS
read from 0xa90df0 [0xa974b5] (32 bytes = 32 (0x20))
- c4 ce 74 2f 95 90 da 13-0b 79 8c c3 0f 48 b4 66 ..t/.y...H.f
0010 - 58 0d 69 52 70 50 b8 9b-c3 40 6e eb 26 ed f2 a3 [EMAIL PROTECTED]...
SSL_connect:error in SSLv3 read server hello A
after SSL_read: 9, 0
select returned 1, read: 1, write 0
calling SSL_read because: [0, 1]
read from 0xa90df0 [0xa974b0] (5 bytes = 5 (0x5))
- 16 03 ..
0005 - SPACES/NULS
read from 0xa90df0 [0xa974b5] (32 bytes = 32 (0x20))
- 19 2f fb fb 02 2b c9 a0-3f 28 84 23 6e 54 15 3a ./...+..?(.#nT.:
0010 - 52 e4 06 2e 53 08 83 7f-ff 36 70 c0 18 f4 d0 7b R...S6p{
SSL 3.0 Handshake [length 0004], HelloRequest
00 00 00 00
read from 0xa90df0 [0xa974b0] (5 bytes = 5 (0x5))
- 16 03 ..
0005 - SPACES/NULS
read from 0xa90df0 [0xa974b5] (32 bytes = 32 (0x20))
- 7b 01 84 a3 c3 db fe 46-c4 f5 d7 f2 e5 dc 03 5a {..F...Z
0010 - b1 f4 23 ba 70 64 ef ae-e0 cf 41 64 75 86 dc dc ..#.pdAdu...
SSL 3.0 Handshake [length 0004], HelloRequest
00 00 00 00
read from 0xa90df0 [0xa974b0] (5 bytes = 5 (0x5))
- 16 03 00 00 60`
read from 0xa90df0 [0xa974b5] (96 bytes = 96 (0x60))
- c8 12 6c 83 1e 55 fd bc-2d 30 6d 2d 20 2d 4a 46 ..l..U..-0m- -JF
0010 - 44 e7 a9 0b 59 c2 f2 a2-74 ec f5 c6 01 49 56 13 D...Y...tIV.
0020 - b6 f4 f2 e4 5b 3d 0f ce-d0 80 60 5b 8e 28 5c 7e [=`[.(\~
0030 - 62 f9 f8 a1 10 9c a0 6c-b3 ce 26 da 6b a3 85 d9 b..l...k...
0040 - c7 e4 9c c0 18 5a f8 c8-36 d7 09 bb 7f 87 4c 13 .Z..6.L.
0050 - 16 f3 6c 65 c2 59 a4 69-5a c7 b1 86 65 87 37 70 ..le.Y.iZ...e.7p
SSL 3.0 Handshake [length 004a], ServerHello
02 00 00 46 03 00 44 8a 2f 89 dc 92 a1 54 78 f1
8b 66 98 44 2e 27 37 06 c9 bd 21 13 6b 6f bf 3f
d4 73 22 29 14 4c 20 ed 03 fd 96 c6 38 58 1a 3b
db 10 f3 de 78 ef a9 ad 64 b3 82 4a b8 b3 10 ec
08 30 61 74 d8 09 fe 00 39 00
SSL_connect:SSLv3 read server hello A
read from 0xa90df0 [0xa974b0] (5 bytes = 5 (0x5))
- 16 03 00 03 d0.