Re: [openssl-users] Find size of available data prior to ssl_read

2015-12-17 Thread counterpoint
Although maybe the simple answer is to read into a temporary 32 KB buffer and
then malloc and copy.



--
View this message in context: 
http://openssl.6102.n7.nabble.com/Find-size-of-available-data-prior-to-ssl-read-tp61722p61734.html
Sent from the OpenSSL - User mailing list archive at Nabble.com.
___
openssl-users mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-users


Re: [openssl-users] Find size of available data prior to ssl_read

2015-12-17 Thread counterpoint
Thanks to Michael and Kurt for explanatory comments.

Is there an available setting that gives the upper limit on the amount of
data that will be obtained by a single ssl_read()?

The data stream is SQL requests, and often these are quite small, but they
can run to megabytes. I need to malloc a buffer for the data. If it is too
small, that will impose extra processing overheads in the rest of the
system. If it is too large, it will impose memory wastage on the rest of the
system.  The system has an upper limit of 32 KB on the initial size of a
buffer for reading, but that is way above the typical SQL request.

So, accepting that I can't set the size precisely, if there is a limit for
SSL data reads that is significantly lower than 32 KB then that might be a
feasible fixed buffer size.  If that isn't possible, maybe it will have to
be a tunable configuration value.  Any comments?



--
View this message in context: 
http://openssl.6102.n7.nabble.com/Find-size-of-available-data-prior-to-ssl-read-tp61722p61733.html
Sent from the OpenSSL - User mailing list archive at Nabble.com.


Re: [openssl-users] Find size of available data prior to ssl_read

2015-12-17 Thread Michael Wojcik
> From: openssl-users [mailto:openssl-users-boun...@openssl.org] On Behalf
> Of counterpoint
> Sent: Thursday, December 17, 2015 04:51
> 
> Although maybe the simple answer is to read into a temporary 32 KB buffer and
> then malloc and copy.

That, more or less, was my recommendation in my previous post.

The optimal size of the temporary buffer depends on factors we don't know. If 
most of your messages fit in 32KB, then that may save you extra calls to 
SSL_read. On the other hand, it could mean excessive copying - it might be 
better to use a smaller buffer to reduce the size of the additional copy 
operation, even at the cost of an extra call to SSL_read. (Obviously some 
copying is happening in the SSL/TLS processing anyway, and the cost of such 
copying is small relative to the cost of decryption and other compute-intensive 
operations. But if your application deals with a high transaction rate then 
cutting down that extra copy may be worthwhile anyway.)

If your application is single-threaded, you can make that a static buffer; if 
not, it needs to go on the stack, which could be a problem if your threads are 
stack-constrained. That's another argument (if it applies to your case) for 
using a smaller initial buffer.

If the first chunk of your message tells you how large the entire message will 
be, then this approach means only one call to the allocator per message 
received, which is good. And it means the same code path for every message 
regardless of size, which is good for program correctness and maintainability.

Based on what you've told us, this is the approach I'd recommend. The only 
question is the size of that initial buffer, and you're in a better position to 
determine that.
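
A minimal sketch of the pattern under discussion: read once into a fixed
32 KB stack buffer, then malloc an exact-size copy. The callback type and
all names here are illustrative stand-ins, not OpenSSL API; in real code
`rd` would wrap SSL_read and map SSL_ERROR_WANT_READ to a non-positive
return.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Temporary-buffer-then-malloc-and-copy sketch.  read_fn stands in for
 * SSL_read: it returns the number of bytes read, or <= 0 on error or
 * would-block.  All names are illustrative. */
#define TMP_BUF_SIZE (32 * 1024)

typedef int (*read_fn)(void *ctx, void *buf, int len);

/* Read once into a 32 KB stack buffer, then allocate an exact-size copy.
 * Returns the malloc'd buffer (caller frees) and sets *out_len, or
 * returns NULL on error / would-block. */
static char *read_exact_copy(read_fn rd, void *ctx, size_t *out_len)
{
    char tmp[TMP_BUF_SIZE];
    int n = rd(ctx, tmp, (int)sizeof tmp);
    if (n <= 0)
        return NULL;
    char *msg = malloc((size_t)n);
    if (msg == NULL)
        return NULL;
    memcpy(msg, tmp, (size_t)n);
    *out_len = (size_t)n;
    return msg;
}
```

The extra copy is the cost discussed above; the gain is a single allocator
call per message and one code path regardless of message size.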

-- 
Michael Wojcik
Technology Specialist, Micro Focus




Re: [openssl-users] Find size of available data prior to ssl_read

2015-12-17 Thread Jakob Bohm

On 17/12/2015 10:36, counterpoint wrote:

> Thanks to Michael and Kurt for explanatory comments.
> 
> Is there an available setting that gives the upper limit on the amount of
> data that will be obtained by a single ssl_read()?
> 
> The data stream is SQL requests, and often these are quite small, but they
> can run to megabytes. I need to malloc a buffer for the data. If it is too
> small, that will impose extra processing overheads in the rest of the
> system. If it is too large, it will impose memory wastage on the rest of the
> system.  The system has an upper limit of 32 KB on the initial size of a
> buffer for reading, but that is way above the typical SQL request.
> 
> So, accepting that I can't set the size precisely, if there is a limit for
> SSL data reads that is significantly lower than 32 KB then that might be a
> feasible fixed buffer size.  If that isn't possible, maybe it will have to
> be a tunable configuration value.  Any comments?

The current SSL/TLS standards limit the per-record data
size to 16 KB (2^14 bytes) exactly; see for example RFC 5246 section 6.2.1.

However, the data you want in your (higher-level) code
probably has a completely different natural size
limit or unit, which may be larger or smaller.  For SQL there
is no natural limit, unless your SQL parser
happens to fail on statements above some arbitrary size.
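
For reference, the 16 KB figure above can be pinned down as a constant (the
macro name is illustrative, not from OpenSSL's headers):

```c
#include <assert.h>

/* TLS 1.2 caps each record's plaintext at 2^14 = 16384 bytes
 * (RFC 5246, section 6.2.1), so a single SSL_read can return at most
 * this much application data from one record.  Illustrative name. */
#define TLS_MAX_PLAINTEXT (1 << 14)   /* 16384 bytes */

/* A receive buffer sized to hold one full record's plaintext: */
static char record_buf[TLS_MAX_PLAINTEXT];
```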


Enjoy and Merry Christmas

Jakob
--
Jakob Bohm, CIO, Partner, WiseMo A/S.  https://www.wisemo.com
Transformervej 29, 2860 Søborg, Denmark.  Direct +45 31 13 16 10
This public discussion message is non-binding and may contain errors.
WiseMo - Remote Service Management for PCs, Phones and Embedded



Re: [openssl-users] Find size of available data prior to ssl_read

2015-12-17 Thread counterpoint
Thanks, that makes sense. My ability to optimise is constrained - the system
is a product so I do not know what the actual pattern of usage will be. But
there is a limit on buffer size within the system. It's a defined symbol, so
can be altered from the default of 32 KB, but only by recompiling the
system. I rely on a working assumption that people who change definitions
and recompile know what they're doing.

The system is threaded, but it is designed to operate with a relatively
small number of highly active threads, so grabbing 32 KB on the stack for a
short period shouldn't be too much of an issue. It would be much harder to
figure out the actual message size because the calls to SSL are taking place
in a generic core, whereas the protocol is in a different layer of code.
There are ways it could be done, but I'm inclined to leave that for a future
optimisation.

That leaves me feeling that the fixed buffer on the stack is the cleanest
solution, involving simple code. The copying overhead is there, but looks
hard to eliminate, and as you say there is plenty of other overhead. I'm not
sure that the small initial buffer offers me much gain, although it might
help in some situations. (Personally I'm inclined to use SSH tunnels rather
than SSL for SQL traffic, but that's another story!).

One remaining point leaves me uncertain. Supposing an SSL write gets the
response SSL_ERROR_WANT_READ. Then there is a POLLIN event. I take it the
first thing that must happen is a retry of the write. Assuming that works,
do I need to assume that there could be data to be read?  Or will a further
event occur, so that I should return to looking out for events?  I guess the
answer to the last question is probably no, but am unsure.





--
View this message in context: 
http://openssl.6102.n7.nabble.com/Find-size-of-available-data-prior-to-ssl-read-tp61722p61741.html
Sent from the OpenSSL - User mailing list archive at Nabble.com.


Re: [openssl-users] Find size of available data prior to ssl_read

2015-12-17 Thread Michael Wojcik
> From: openssl-users [mailto:openssl-users-boun...@openssl.org] On Behalf
> Of counterpoint
> Sent: Thursday, December 17, 2015 11:35
> 
> Thanks, that makes sense. My ability to optimise is constrained - the system
> is a product so I do not know what the actual pattern of usage will be. But
> there is a limit on buffer size within the system. It's a defined symbol, so
> can be altered from the default of 32 KB, but only by recompiling the
> system. I rely on a working assumption that people who change definitions
> and recompile know what they're doing.

Fair enough.

> The system is threaded, but it is designed to operate with a relatively
> small number of highly active threads, so grabbing 32 KB on the stack for a
> short period shouldn't be too much of an issue.

It's not really a matter of how many threads there are (except indirectly), or 
of how long the item is on the stack. It's a question of how much space is 
available on the thread's stack when you try to allocate the buffer (which, 
assuming we're talking C or C++, is when you enter the function / method).

A thread's stack size is typically set at creation time, with a default that 
may be fixed in the threading implementation or set at link time. How much 
space is available when you allocate that 32 KB buffer depends on how deep your 
call chain is and how much data each of those frames adds to the stack.

If the stack is too small to accommodate the buffer and can't be expanded, 
you'll get some kind of run-time failure, like a Windows exception or a UNIX 
signal.

Note that stack space is an address-space resource, not (generally) a virtual 
memory one - that is, stack-space is unlikely to be constrained because the 
system is running short on virtual memory. It'll happen because most language 
implementations use contiguous stacks for performance (rather than, say, 
displays or other non-contiguous structures), and if the stack runs into 
something else in the process address space, it can't grow any further. So if 
your process is 64-bit, you should be able to specify ridiculously large thread 
stacks and not worry about it.

If the process is 32-bit, take a look at your thread stack sizes and do a quick 
estimate on how much space you expect will be there. You can determine this for 
a specific thread, in a specific run, in a debugger by looking at the address 
of an automatic variable at the bottom of the thread's stack (in the thread's 
initial function) and the address of one in your data-receiving function. 
(Technically comparing those addresses isn't authorized by the language 
standard, but it's valid on most of the platforms OpenSSL supports.)

So I'd say try it in some test runs and see if it looks like stack space might 
be getting tight; if so, you can likely increase the stack size you specify 
when creating your threads, since you don't have many of them.
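
As a sketch of that last point (all sizes and names below are illustrative,
not from the original poster's system), a worker thread can be created with
an explicit stack size that leaves generous headroom above the 32 KB
on-stack receive buffer:

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Illustrative sizes: the 32 KB receive buffer lives on the worker's
 * stack, so the stack is created with plenty of headroom. */
#define RECV_BUF_SIZE (32 * 1024)
#define WORKER_STACK  (512 * 1024)

static void *worker(void *arg)
{
    char buf[RECV_BUF_SIZE];      /* the on-stack receive buffer */
    memset(buf, 0, sizeof buf);   /* stand-in for real receive work */
    return arg;
}

/* Create a worker with an explicit stack size; returns 0 on success. */
static int spawn_worker(void)
{
    pthread_t tid;
    pthread_attr_t attr;
    if (pthread_attr_init(&attr) != 0)
        return -1;
    if (pthread_attr_setstacksize(&attr, WORKER_STACK) != 0) {
        pthread_attr_destroy(&attr);
        return -1;
    }
    int rc = pthread_create(&tid, &attr, worker, NULL);
    pthread_attr_destroy(&attr);
    if (rc != 0)
        return -1;
    return pthread_join(tid, NULL);
}
```

With few, highly active threads, over-provisioning each stack this way
costs little in a 64-bit address space.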

> One remaining point leaves me uncertain. Supposing an SSL write gets the
> response SSL_ERROR_WANT_READ. Then there is a POLLIN event. I take it
> the
> first thing that must happen is a retry of the write. Assuming that works,
> do I need to assume that there could be data to be read?  Or will a further
> event occur, so that I should return to looking out for events?  I guess the
> answer to the last question is probably no, but am unsure.

There could be data to be read. Consider this scenario:

1. The peer decides it wants to renegotiate during the conversation.
2. In the middle of the handshake, you call SSL_write. The handshake hasn't 
completed, and the local side is waiting for a message from the peer, so 
SSL_write returns SSL_ERROR_WANT_READ.
3. You wait for POLLIN, then call SSL_write again.
4. Before SSL_write returns, the peer has time to respond to the request you 
just sent. Or it sends something else immediately after completing the 
handshake, if your application doesn't use a strict switched-duplex 
request-response protocol.

So I'd recommend going ahead and trying a non-blocking SSL_read at that point. 
The overhead is tiny and you won't miss any inbound-data events.
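
That handling rule can be condensed into a small decision helper. The enum
values below stand in for SSL_write/SSL_get_error outcomes; none of these
names are OpenSSL API, and real code would branch on SSL_get_error() itself:

```c
#include <assert.h>

/* Illustrative condensation of the event-handling rule above. */
enum io_result { IO_OK, IO_WANT_READ, IO_WANT_WRITE, IO_ERROR };
enum next_action { TRY_READ, WAIT_FOR_EVENTS, FAIL };

/* After POLLIN arrives while an SSL_write is pending: retry the write
 * first; if it succeeds, opportunistically attempt a non-blocking
 * SSL_read, because data may already be buffered and no further POLLIN
 * is guaranteed. */
static enum next_action on_pollin_with_pending_write(enum io_result write_retry)
{
    switch (write_retry) {
    case IO_OK:         return TRY_READ;        /* data may be waiting */
    case IO_WANT_READ:  return WAIT_FOR_EVENTS; /* still needs peer data */
    case IO_WANT_WRITE: return WAIT_FOR_EVENTS; /* wait for POLLOUT */
    default:            return FAIL;
    }
}
```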

-- 
Michael Wojcik
Technology Specialist, Micro Focus




Re: [openssl-users] Find size of available data prior to ssl_read

2015-12-17 Thread counterpoint
Thanks, very helpful. We only support 64 bit.



--
View this message in context: 
http://openssl.6102.n7.nabble.com/Find-size-of-available-data-prior-to-ssl-read-tp61722p61746.html
Sent from the OpenSSL - User mailing list archive at Nabble.com.


Re: [openssl-users] Find size of available data prior to ssl_read

2015-12-16 Thread Michael Wojcik
> From: openssl-users [mailto:openssl-users-boun...@openssl.org] On Behalf
> Of Martin Brampton
> Sent: Wednesday, December 16, 2015 13:23
> 
> Is there a way to obtain the amount of data available to be read?
> 
> I'm working with a system that operates in non-blocking mode using
> epoll. When an EPOLLIN event is received the aim is to read the data.
> For the non-SSL case, the amount of data can be obtained using ioctl
> FIONREAD.  This is used to malloc a suitable sized buffer, followed by
> read the data into the buffer.
> 
> How should the SSL version of our code work?  At present it is using the
> sum of the number obtained from ioctl FIONREAD (which seems suspect
> when
> SSL is in use and appears to be always too large) and the number from
> ssl_pending (which seems to be zero).  The buffer then has to be truncated.

TCP is a stream service. It may deliver (to the application, which in this case 
means to OpenSSL) part of an SSL/TLS record, a single complete record, multiple 
records...

In some situations, you may reliably receive one TLS record at a time. You 
can't assume that will be the general case, particularly for application 
protocols that aren't simple alternating request-response pairs, or over long 
network paths, or with large blocks of application data, or if the recipient's 
stack is squeezed for resources.

FIONREAD will show the amount of data available from the stack. SSL_pending 
will show the amount of application data from complete records OpenSSL has 
already received and processed that the application has not read from OpenSSL 
yet. Per above, the former can represent anything from less than one record 
to multiple records, possibly with a partial one at the end. The latter may 
well not be zero, 
for example if the peer does multiple sends, or sends a block of data large 
enough that it gets chunked into multiple TLS records; then OpenSSL may read 
data from the stack and get multiple complete records, in which case 
SSL_pending will be > 0.

Note that nothing in the OpenSSL API gives you the number of bytes of a partial 
record that OpenSSL has received from the stack.

Even in the ideal case where exactly a single TLS record is sitting in the 
stack's buffers, FIONREAD will be larger than the size of the application data, 
because it's a TLS record, which has non-zero overhead. Specifically it has a 
header containing type, version, and length, and a footer with MAC and padding. 
The application only gets the application data, so it must get fewer than 
FIONREAD bytes.
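
To make that overhead concrete, here is a rough back-of-the-envelope
calculation for one common TLS 1.2 ciphersuite, AES-128-CBC with
HMAC-SHA256: a 5-byte record header, a 16-byte explicit IV, a 32-byte MAC,
and 1-16 bytes of padding to the 16-byte block size. The figures and the
helper are illustrative, not OpenSSL API:

```c
#include <assert.h>

/* Rough on-the-wire size of one TLS 1.2 record for AES-128-CBC with
 * HMAC-SHA256: header + explicit IV, plus plaintext, MAC and padding
 * rounded up to the cipher block size.  Illustrative figures. */
#define REC_HEADER    5   /* content type + version + length */
#define EXPLICIT_IV  16
#define MAC_SHA256   32
#define BLOCK        16

static int min_record_bytes(int plaintext)
{
    int to_pad = plaintext + MAC_SHA256 + 1;        /* at least 1 pad byte */
    int padded = (to_pad + BLOCK - 1) / BLOCK * BLOCK;
    return REC_HEADER + EXPLICIT_IV + padded;
}
```

So 100 bytes of application data occupy at least 165 bytes on the wire
under these assumptions, which is why FIONREAD over-reports the amount of
application data available.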

Unless I'm forgetting something, since OpenSSL will only deliver application 
data to the caller, and only from a complete record, then:

- If, when the application obtains the values of FIONREAD + SSL_pending (call 
this sum N), at least one complete TLS record has been received by the stack 
and not yet read by the application, then the amount of data the application 
gets from SSL_read will be strictly less than N.
- Otherwise, when the application obtains those values too early and N is 
smaller than the record OpenSSL will eventually assemble, the amount of 
application data *may* be greater than, equal to (unlikely), or less than N. 
In this case there's simply no way for the application to know.

> Can this approach work?

No. OpenSSL doesn't know how much data is in a TLS record until it's processed 
it, and it doesn't know that until it has the complete record. (It could assume 
the record is valid before it has the complete record and look at the length 
field, but it doesn't know how long the padding is until it has the very last 
byte. And assuming the record is valid is a Bad Idea.)

Consequently, your application can't know that either.

Looking at the amount of data buffered by the stack is pointless, for the 
reasons discussed above.

>  Could it be improved?  Or is there some
> fundamental problem with operating in this way?

The fundamental problem is that you don't know how much data is going to be 
available from whatever complete records OpenSSL has received, and you don't 
even know that OpenSSL has received a complete record. The sender could be 
dribbling data to you one byte at a time. (This would be perverse, but what if 
some MITM is mucking about with your window announcements? Note those are at 
the TCP protocol level and so are not protected by TLS.)

You might want to look at something like this:

- Use non-blocking sockets. When you get a POLLIN event, try SSL_read with a 
small fixed buffer. If it returns SSL_WANT_READ, you don't have a complete 
record yet.
- Set the read-ahead flag with SSL_CTX_set_read_ahead (before creating your SSL 
objects), so that OpenSSL will grab all available data off the wire when you 
call SSL_read; that will reduce useless POLLIN events.
- When you have a successful SSL_read, use SSL_pending to get the number of 
application-data bytes remaining. Allocate a buffer of fixed-small-buffer-size 
+ value-from-SSL_pending. Copy in the small fixed buffer's contents, then 
read the remaining bytes into the new buffer with SSL_read.
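
Put together, the suggested pattern might look like the sketch below: a
small fixed read, SSL_pending for the remainder, one allocation, copy, then
drain. ssl_read_fn / ssl_pending_fn are simplified stand-ins for SSL_read()
and SSL_pending(), and every name here is illustrative:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Sketch: read a small fixed chunk, ask how much processed application
 * data is still pending, allocate once, copy the chunk in, and drain the
 * remainder.  All names are illustrative stand-ins, not OpenSSL API. */
#define SMALL_BUF 512

typedef int (*ssl_read_fn)(void *ssl, void *buf, int len);
typedef int (*ssl_pending_fn)(void *ssl);

/* Returns a malloc'd buffer holding one batch of application data (caller
 * frees) and sets *out_len; returns NULL if nothing was readable. */
static char *read_message(void *ssl, ssl_read_fn rd, ssl_pending_fn pending,
                          size_t *out_len)
{
    char small[SMALL_BUF];
    int n = rd(ssl, small, (int)sizeof small);
    if (n <= 0)
        return NULL;                  /* would-block or error: no data yet */

    int extra = pending(ssl);         /* processed bytes still buffered */
    char *msg = malloc((size_t)n + (size_t)extra);
    if (msg == NULL)
        return NULL;
    memcpy(msg, small, (size_t)n);    /* copy in the small fixed buffer */

    int total = n;
    while (extra > 0) {               /* drain what SSL_pending reported */
        int m = rd(ssl, msg + total, extra);
        if (m <= 0)
            break;
        total += m;
        extra -= m;
    }
    *out_len = (size_t)total;
    return msg;
}
```

This gives one allocator call per batch and never over-allocates beyond
SMALL_BUF plus what SSL_pending reports.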

Re: [openssl-users] Find size of available data prior to ssl_read

2015-12-16 Thread Kurt Roeckx
On Wed, Dec 16, 2015 at 06:23:25PM +, Martin Brampton wrote:
> Is there a way to obtain the amount of data available to be read?
> 
> I'm working with a system that operates in non-blocking mode using epoll.
> When an EPOLLIN event is received the aim is to read the data. For the
> non-SSL case, the amount of data can be obtained using ioctl FIONREAD.  This
> is used to malloc a suitable sized buffer, followed by read the data into
> the buffer.
> 
> How should the SSL version of our code work?  At present it is using the sum
> of the number obtained from ioctl FIONREAD (which seems suspect when SSL is
> in use and appears to be always too large) and the number from ssl_pending
> (which seems to be zero).  The buffer then has to be truncated.

Please note that SSL_pending() reports data from already
processed / decrypted TLS records.  If a record is not complete,
it's not processed, and the library won't tell you how big it is.
This means it's possible for SSL_pending() to return 0 even though
the kernel receiving a single further byte would make a whole
record available.

If you then read only 1 byte, calling SSL_pending() will
actually tell you how many further bytes that have already passed
all the checks are available for you.

So the library can have unprocessed bytes from a TLS record in
its internal buffer, but it's not going to tell you much about
them.

SSL / TLS also has overhead, and the data might not even be
application data.  Also, some ciphers work in blocks, so padding
may be added to fill those blocks.  So there are various reasons
why you might receive less data than FIONREAD suggests, too.

If you always call SSL_read() on record boundaries you'll always
get less data than FIONREAD reports, but there is really no way
for you to see that you're on a boundary.  It might be that in
your application this is always what happens, but I wouldn't rely
on it.

If you don't call it on record boundaries, there is little
you can predict about the size you're going to get.


Kurt
