Re: [openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Hanno Böck
Hello,

On Tue, 26 Apr 2016 16:58:48 +
Michael Wojcik wrote:

> But, again, this is just a performance and efficiency hit - it won't
> break anything - and if it's on the Apache side, there probably isn't
> much you can do about it. Maybe it's tunable in the Apache
> configuration but it seems like an odd thing to make configurable,
> and even odder to make wrong by default.

First of all: Before you continue speculating, my server is not doing
anything secret, just connect to it :-) (the one behind hboeck.de)

It's definitely chunking; if I connect manually via openssl s_client I
can see it.

The reason (as Rainer pointed out in a private mail) is the server-side
includes used in the error pages. It seems Apache's server-side includes
implementation produces lots of small chunks.

This essentially means my error pages are served horribly inefficiently.
However, I think that doesn't matter too much, as they should only be
served on errors, and errors should hopefully be scarce. This does not
happen with static content. With PHP content I still get chunked
encoding, but not this many small chunks.

I think we're getting pretty far away from OpenSSL, so I hope nobody is
annoyed by the off-topic discussion (and I think we can close it here).
But since people were speculating and it seemed to have generated quite
some interest, I wanted to give a final answer on what the cause was.

-- 
Hanno Böck
https://hboeck.de/

mail/jabber: ha...@hboeck.de
GPG: BBB51E42


-- 
openssl-users mailing list
To unsubscribe: https://mta.openssl.org/mailman/listinfo/openssl-users


Re: [openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Michael Wojcik
> From: Michael Wojcik
> Sent: Tuesday, April 26, 2016 12:39
> To: openssl-users@openssl.org
> Subject: RE: [openssl-users] BIO_read hangs, how can I know if the server
> wants to send data?
> 
> Ugh. Apache is doing the Wrong Thing. It's sending data as it generates it,
> instead of coalescing. Those two-octet messages are almost certainly the CR
> LF pairs that terminate lines of the HTTP header.

Rainer Jung has pointed out this may well be a chunked message body. I was 
thinking we were still looking at the HTTP header here.

If it's a chunked message body, that's more understandable, but it's still the 
wrong thing for Apache to be doing. With TCP you always, always, always want to 
present all the associated outbound data to the stack at once, to avoid the 
overhead of sending small packets and Nagle / Delayed ACK. It appears that 
Apache is writing the chunk header and then the chunk data, which is precisely 
what it should NOT do.
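
To illustrate the coalescing point above: a minimal sketch of framing one chunk (size line, data, trailing CRLF) into a single buffer, so it can be handed to the stack, or to a TLS write call, in one piece. The function name format_chunk and its buffer-based interface are hypothetical, chosen for illustration; this is not how Apache is implemented.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: frame one HTTP/1.1 chunk (hex size line, data,
 * CRLF) into a single contiguous buffer.
 * Returns the number of bytes placed in out, or 0 if out is too small. */
size_t format_chunk(const void *data, size_t len, char *out, size_t out_cap)
{
    assert(data != NULL && out != NULL);

    /* Chunk size line: length in hex followed by CRLF. */
    int hdr = snprintf(out, out_cap, "%zx\r\n", len);
    if (hdr < 0 || (size_t)hdr >= out_cap)
        return 0;

    size_t total = (size_t)hdr + len + 2;   /* + CRLF after the data */
    if (total > out_cap)
        return 0;

    memcpy(out + hdr, data, len);
    memcpy(out + hdr + len, "\r\n", 2);
    return total;
}
```

The caller then issues one write of the returned length instead of separate writes for the header and the data.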

That said, I don't actually think this is chunked TE, because the chunk 
header must be at least 3 octets: at least one hex digit, CR, and LF.

But, again, this is just a performance and efficiency hit - it won't break 
anything - and if it's on the Apache side, there probably isn't much you can do 
about it. Maybe it's tunable in the Apache configuration but it seems like an 
odd thing to make configurable, and even odder to make wrong by default.

-- 
Michael Wojcik
Technology Specialist, Micro Focus




Re: [openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Michael Wojcik
> From: Hanno Böck [mailto:ha...@hboeck.de]
> Sent: Tuesday, April 26, 2016 12:13
> To: Michael Wojcik
> Cc: openssl-users@openssl.org
> Subject: Re: [openssl-users] BIO_read hangs, how can I know if the server
> wants to send data?
> 
> Thanks for both your answers, that was very helpful (although it
> probably means what I'm trying to do is more complicated than I
> thought)...

My suggestion: Use an HTTP library such as libcurl. libcurl already supports 
integration with OpenSSL. Don't reinvent the HTTP wheel.

> One more question you might be able to answer:
> When I run my test code and connect to google.com I get the following
> bytes read for each BIO_read call:
> 1024
> 365
> 289

So Google is sending three TLS records with application data. That means it's 
doing the Right Thing: coalescing outbound data into a few messages. Without a 
wire trace, we can only guess what those three are.

> When I run these against my own server (relatively standard
> apache2.4+openssl setup) I get very different numbers:
> 240
> 287
> 2
> 588
> 2
> 41
> 2

Ugh. Apache is doing the Wrong Thing. It's sending data as it generates it, 
instead of coalescing. Those two-octet messages are almost certainly the CR LF 
pairs that terminate lines of the HTTP header.

Why is this wrong? Because sending short TCP packets is inefficient, and can 
trigger Nagle / Delayed ACK Interaction, which rate-limits the traffic. An 
application can prevent NDAI by disabling Nagle, but in most cases that's just 
a sign that the application developer doesn't know how to use TCP properly.

The problem is reduced a bit by using TLS, because the TLS record and 
encryption overhead make some smaller packets bigger than the MSS, and thus not 
affected by Nagle. But it's still bad. (And from the application developer's 
point of view, TLS libraries typically make the problem harder to resolve, due 
to a lack of a gather-send operation.)
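
For a plain (non-TLS) socket, the gather-send mentioned above might look like the following sketch, which uses POSIX writev() to pass the chunk header, the data and the trailing CRLF to the kernel in one call. The helper name send_chunk_gathered is hypothetical, and short writes are not handled.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one chunk (size line, data, CRLF) with a
 * single writev() call instead of three write() calls.
 * Returns what writev() returns: bytes written, or -1 on error.
 * A production version would also have to handle short writes. */
ssize_t send_chunk_gathered(int fd, const void *data, size_t len)
{
    char hdr[32];
    struct iovec iov[3];

    assert(data != NULL);
    int hdr_len = snprintf(hdr, sizeof hdr, "%zx\r\n", len);
    if (hdr_len < 0 || (size_t)hdr_len >= sizeof hdr)
        return -1;

    iov[0].iov_base = hdr;
    iov[0].iov_len  = (size_t)hdr_len;
    iov[1].iov_base = (void *)data;   /* writev() only reads from it */
    iov[1].iov_len  = len;
    iov[2].iov_base = "\r\n";
    iov[2].iov_len  = 2;

    return writev(fd, iov, 3);
}
```

With a TLS library that offers no equivalent, the usual alternative is to copy the pieces into one buffer before the write call.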
 
> Why is this so much more split up? And to what correspond these
> BIO_read chunks on the protocol level? Are these TLS records? TCP
> packets? Is there something horribly wrong with my server config
> because it splits them up in so many small parts?

I don't know enough about Apache configuration to say. (If Apache routinely 
does this, though, it's no wonder there are all those benchmarks showing better 
performance with Nginx. This is Sockets 101 stuff.)

In any case it's not "horribly wrong". It's just costing you bandwidth and 
latency.

-- 
Michael Wojcik
Technology Specialist, Micro Focus



Re: [openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Hanno Böck
On Tue, 26 Apr 2016 18:31:31 +0200
Rainer Jung wrote:

> The second pattern looks like "Transfer-Encoding: chunked". In this 
> mode, a response is sent in chunks and each chunk is preceded by a
> hex number telling how big the next chunk is. The last chunk is
> followed by a "0" indicating no more chunks are expected. So the "2"
> is the size of the chunk size (two hex digits), next comes the chunk
> itself.
> 
> That sort of encoding is typically used for dynamic content, when the 
> final size of the response is not known in advance to avoid needing
> to buffer the whole response before sending it. It does not use a 
> content-length header. Another case might be a transformation during 
> response delivery that changes the size in a way that is not easy to 
> calculate in advance, like compression.

Thanks, that was it. If I look at the data coming in, that's exactly
what it looks like. (I still wonder why Apache does that for a 404 error
page, but at least now I know what's going on.)

-- 
Hanno Böck
https://hboeck.de/

mail/jabber: ha...@hboeck.de
GPG: BBB51E42




Re: [openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Matt Caswell


On 26/04/16 17:22, Karl Denninger wrote:
> It's split up due to the vagaries of TCP and how it delivers packets.
> 
> In short a local network connection will tend to deliver smaller packets
> of data than a distant one, all things being equal -- but they never
> are.  All you're guaranteed with TCP is a byte-stream that is coherent,
> as was noted in the earlier reply, but you are not guaranteed how many
> bytes will come at once.  When select() receives a ready state for
> reading or a blocking read returns there could be zero (if there's an
> error, exception, or in the case of SSL a possible protocol
> renegotiation) or more bytes available to read.
> 
> There is nothing wrong with either your server or the other end, it's
> just how it works.  Typically the difference is a matter of how things
> match up between how many bytes are received and buffered in your
> protocol stack before you read them vs. how fast the other end can
> write them and get them to you, which for a wide-area connection usually
> involves a lot of routers in the middle.  With TCP there are additional
> confounding factors, since the protocol itself implements window control
> (size of outstanding transmissions that are allowed), SACK can come into
> play, latency of the circuit and routing points in the middle get
> involved, etc.  For wide-area connections (think Internet) slow-start
> congestion control (which helps avoid a fast server blasting data at a
> rate that could otherwise cause a buffer overflow somewhere in the
> middle, thus requiring a retransmit) also plays a part.


While that is true I don't think it explains the behaviour. A single TLS
record may get split up into multiple small TCP packets (dependent on
the vagaries of the network as you point out). But OpenSSL won't return
app data to the application until it has read the entire TLS record.
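
When comparing BIO_read sizes with a wire capture, it may help to know that each TLS record on the wire starts with a 5-byte header: content type, two version bytes, and a 16-bit big-endian length. Below is a minimal sketch of decoding that header; the struct and function names are illustrative and are not OpenSSL types. Note that the length covers the protected fragment (including MAC, padding or IV overhead), so it will be larger than the plaintext byte count handed back to the application.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TLS_RECORD_HEADER_LEN   5
#define TLS_CT_APPLICATION_DATA 23

/* Illustrative struct, not an OpenSSL type. */
struct tls_record_header {
    uint8_t  content_type;   /* 20 CCS, 21 alert, 22 handshake, 23 app data */
    uint8_t  version_major;
    uint8_t  version_minor;
    uint16_t length;         /* length of the protected fragment that follows */
};

/* Decode the 5-byte record header at buf.
 * Returns 0 on success, -1 if fewer than 5 bytes are available. */
int parse_tls_record_header(const uint8_t *buf, size_t len,
                            struct tls_record_header *h)
{
    assert(buf != NULL && h != NULL);
    if (len < TLS_RECORD_HEADER_LEN)
        return -1;

    h->content_type  = buf[0];
    h->version_major = buf[1];
    h->version_minor = buf[2];
    h->length        = (uint16_t)((buf[3] << 8) | buf[4]);
    return 0;
}
```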

Matt



Re: [openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Rainer Jung

On 26.04.2016 at 18:12, Hanno Böck wrote:

> Thanks for both your answers, that was very helpful (although it
> probably means what I'm trying to do is more complicated than I
> thought)...
>
> One more question you might be able to answer:
> When I run my test code and connect to google.com I get the following
> bytes read for each BIO_read call:
> 1024
> 365
> 289
>
> When I run these against my own server (relatively standard
> apache2.4+openssl setup) I get very different numbers:
> 240
> 287
> 2
> 588
> 2
> 41
> 2
> 115
> 2
> 12
> 2
> 110
> 2
> 69
> 2
> 20
> 2
> 6
> 2
> 34
> 2
> 17
> 2
> 12
> 2
> 37
> 2
> 290
> 2
> 6
> 5
>
> Why is this so much more split up? And to what correspond these
> BIO_read chunks on the protocol level? Are these TLS records? TCP
> packets? Is there something horribly wrong with my server config
> because it splits them up in so many small parts?


The second pattern looks like "Transfer-Encoding: chunked". In this 
mode, a response is sent in chunks and each chunk is preceded by a hex 
number telling how big the next chunk is. The last chunk is followed by 
a "0" indicating no more chunks are expected. So the "2" is the size of 
the chunk size (two hex digits), next comes the chunk itself.
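
As a rough illustration of the chunked framing described above, here is a minimal sketch of a decoder for a complete chunked body that is already in memory. The function name decode_chunked is hypothetical; chunk extensions and trailers are not handled, and the final CRLF after the zero-size chunk is not checked.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical helper: decode a complete chunked body held in memory.
 * Returns the decoded length, or -1 if the input is malformed or
 * truncated, or if out is too small. */
ssize_t decode_chunked(const char *in, size_t in_len, char *out, size_t out_cap)
{
    size_t pos = 0, written = 0;

    assert(in != NULL && out != NULL);
    for (;;) {
        /* Find the CRLF that ends the chunk-size line. */
        size_t eol = pos;
        while (eol + 1 < in_len && !(in[eol] == '\r' && in[eol + 1] == '\n'))
            eol++;
        if (eol + 1 >= in_len || eol == pos || eol - pos > 8)
            return -1;

        /* Parse the hex chunk size (at most 8 digits, so no overflow). */
        size_t size = 0;
        for (size_t i = pos; i < eol; i++) {
            int c = (unsigned char)in[i], d;
            if (c >= '0' && c <= '9')      d = c - '0';
            else if (c >= 'a' && c <= 'f') d = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') d = c - 'A' + 10;
            else return -1;
            size = size * 16 + (size_t)d;
        }
        pos = eol + 2;

        if (size == 0)                     /* last chunk */
            return (ssize_t)written;

        /* The chunk data must be present and followed by CRLF. */
        if (size > in_len - pos || in_len - pos - size < 2)
            return -1;
        if (in[pos + size] != '\r' || in[pos + size + 1] != '\n')
            return -1;
        if (size > out_cap - written)
            return -1;

        memcpy(out + written, in + pos, size);
        written += size;
        pos += size + 2;
    }
}
```

A streaming client would apply the same steps incrementally, reading more data whenever a size line or chunk is incomplete.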


That sort of encoding is typically used for dynamic content, when the 
final size of the response is not known in advance to avoid needing to 
buffer the whole response before sending it. It does not use a 
content-length header. Another case might be a transformation during 
response delivery that changes the size in a way that is not easy to 
calculate in advance, like compression.


Since it is a bit of pattern guessing, you should check this by looking 
at the HTTP response headers.


Still, one could ask whether it is actually efficient to send the 
response in such small parts, but that's more a question for the sender.


Regards,

Rainer


Re: [openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Karl Denninger
On 4/26/2016 11:12, Hanno Böck wrote:
> Thanks for both your answers, that was very helpful (although it
> probably means what I'm trying to do is more complicated than I
> thought)...
>
> One more question you might be able to answer:
> When I run my test code and connect to google.com I get the following
> bytes read for each BIO_read call:
> 1024
> 365
> 289
>
> When I run these against my own server (relatively standard
> apache2.4+openssl setup) I get very different numbers:
> 240
> 287
> 2
> 588
> 2
> 41
> 2
> 115
> 2
> 12
> 2
> 110
> 2
> 69
> 2
> 20
> 2
> 6
> 2
> 34
> 2
> 17
> 2
> 12
> 2
> 37
> 2
> 290
> 2
> 6
> 5
>
> Why is this so much more split up? And to what correspond these
> BIO_read chunks on the protocol level? Are these TLS records? TCP
> packets? Is there something horribly wrong with my server config
> because it splits them up in so many small parts?
>
>
>
It's split up due to the vagaries of TCP and how it delivers packets.

In short a local network connection will tend to deliver smaller packets
of data than a distant one, all things being equal -- but they never
are.  All you're guaranteed with TCP is a byte-stream that is coherent,
as was noted in the earlier reply, but you are not guaranteed how many
bytes will come at once.  When select() receives a ready state for
reading or a blocking read returns there could be zero (if there's an
error, exception, or in the case of SSL a possible protocol
renegotiation) or more bytes available to read.
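
Because a single read may return fewer bytes than requested, code that needs a known number of bytes usually loops. A minimal sketch for a plain file descriptor follows; the helper name read_exact is hypothetical, and with TLS the same loop shape would wrap the library's read call instead of read().

```c
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read exactly want bytes from fd unless EOF or an
 * error occurs first.
 * Returns the number of bytes read (less than want only on EOF),
 * or -1 on error. */
ssize_t read_exact(int fd, void *buf, size_t want)
{
    size_t got = 0;

    assert(buf != NULL);
    while (got < want) {
        ssize_t n = read(fd, (char *)buf + got, want - got);
        if (n == 0)
            break;                  /* peer closed the stream */
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal, retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```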

There is nothing wrong with either your server or the other end, it's
just how it works.  Typically the difference is a matter of how things
match up between how many bytes are received and buffered in your
protocol stack before you read them vs. how fast the other end can
write them and get them to you, which for a wide-area connection usually
involves a lot of routers in the middle.  With TCP there are additional
confounding factors, since the protocol itself implements window control
(size of outstanding transmissions that are allowed), SACK can come into
play, latency of the circuit and routing points in the middle get
involved, etc.  For wide-area connections (think Internet) slow-start
congestion control (which helps avoid a fast server blasting data at a
rate that could otherwise cause a buffer overflow somewhere in the
middle, thus requiring a retransmit) also plays a part.

-- 
Karl Denninger
k...@denninger.net 
/The Market Ticker/
/[S/MIME encrypted email preferred]/




Re: [openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Matt Caswell


On 26/04/16 17:12, Hanno Böck wrote:
> Why is this so much more split up? And to what correspond these
> BIO_read chunks on the protocol level? Are these TLS records? TCP
> packets? Is there something horribly wrong with my server config
> because it splits them up in so many small parts?


Odd. OpenSSL should read and process whole records in one go. As long as
the size that you pass to BIO_read is bigger than the record size you
should get passed back to your code a whole record in one go.

You might want to do a wireshark capture and see what app data record
sizes are being passed back.

Matt





Re: [openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Salz, Rich
One thing you could do is raw TCP reads, to see what the read() call 
returns, at least for your local server.

It could well be network characteristics.


Re: [openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Hanno Böck
Thanks for both your answers, that was very helpful (although it
probably means what I'm trying to do is more complicated than I
thought)...

One more question you might be able to answer:
When I run my test code and connect to google.com I get the following
bytes read for each BIO_read call:
1024
365
289

When I run these against my own server (relatively standard
apache2.4+openssl setup) I get very different numbers:
240
287
2
588
2
41
2
115
2
12
2
110
2
69
2
20
2
6
2
34
2
17
2
12
2
37
2
290
2
6
5

Why is this so much more split up? And to what correspond these
BIO_read chunks on the protocol level? Are these TLS records? TCP
packets? Is there something horribly wrong with my server config
because it splits them up in so many small parts?

-- 
Hanno Böck
https://hboeck.de/

mail/jabber: ha...@hboeck.de
GPG: BBB51E42




Re: [openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Michael Wojcik
> From: openssl-users [mailto:openssl-users-boun...@openssl.org] On Behalf
> Of Matt Caswell
> Sent: Tuesday, April 26, 2016 10:06
> To: openssl-users@openssl.org
> Subject: Re: [openssl-users] BIO_read hangs, how can I know if the server
> wants to send data?
> 
> On 26/04/16 14:28, Hanno Böck wrote:
> >
> > What I want to do: Send a couple of HTTP requests over one connection
> > (with HTTP/1.1, keep-alive enabled).
> > Seems simple enough: I send a HTTP request and then read what the
> > server sends, then send the next.
> >
> > However: How do I know when the server has stopped sending?
> > I have attached a code sample (it's missing lots of error checking in
> > the initialization phase, but that's just for simplification of the
> > code and shouldn't matter for now).
> 
> There are a few ways of doing this:
> 
> 1) Track it at the application protocol layer. For example read the
> "Content-Length" HTTP header and wait until you've received that amount
> of data. This is probably the best way. The other ways below only tell
> you whether the network *currently* has any data to provide to you - not
> whether the server has finished sending.

A couple of points:

- This problem applies to any TCP-based application, regardless of whether TLS 
is used. TCP is a full-duplex byte-stream protocol. It does not provide any 
record-boundary or flow-direction indicators. I would strongly recommend 
consulting a good TCP communications reference such as Stevens' /UNIX Network 
Programming/ or Comer's /Internetworking with TCP/IP/.

- You can't rely on the presence of a Content-length header in the server's 
response. For HTTP/1.1, the ways in which a response can be delimited are:
  - Some request types, such as HEAD, do not allow a message body in the 
    response, regardless of what header lines were present. In this case the 
    response is delimited by the blank line that follows the head.
  - Some response types, notably the 1xx range, do not allow a message 
    body in the response, and are delimited by the blank line at the end of 
    the head.
  - A response can be delimited by terminating the connection.
  - A response can include a message body which is exactly the number of 
    octets specified in the optional Content-length header.
  - A response can be delimited by using a self-delimiting transfer 
    encoding. In practice, this means using the "chunked" transfer-encoding, 
    and indicating the end of the message body with a zero-length chunk. If 
    trailers are allowed, the actual end of the response is the end of the 
    trailers. If the chunked T-E is used, any Content-length header MUST be 
    ignored.
  - A response can be delimited by using a self-delimiting Content-type, 
    such as MIME multipart types, if the client accepts such content.

Thus determining the end of an HTTP message is not trivial. See RFC 2616 for 
details.
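
The rules above could be sketched roughly as follows. This is a simplified illustration, not a complete HTTP parser: the names body_framing, find_header and classify_response are hypothetical, the response head is assumed to be NUL-terminated and complete, only a Transfer-Encoding value that starts with "chunked" is recognised, and self-delimiting content types are not handled. The 204 and 304 cases come from the RFC rather than from the list above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>   /* strncasecmp (POSIX) */

enum body_framing {
    FRAMING_NO_BODY,         /* HEAD response, 1xx, 204, 304 */
    FRAMING_CHUNKED,         /* Transfer-Encoding: chunked */
    FRAMING_CONTENT_LENGTH,  /* Content-Length: N */
    FRAMING_UNTIL_CLOSE      /* body ends when the server closes */
};

/* Return a pointer to the value of header `name` in a NUL-terminated
 * response head, or NULL if it is absent. Names match case-insensitively. */
static const char *find_header(const char *head, const char *name)
{
    size_t n = strlen(name);
    const char *line = strstr(head, "\r\n");   /* skip the status line */

    while (line != NULL) {
        line += 2;
        if (strncasecmp(line, name, n) == 0 && line[n] == ':') {
            const char *v = line + n + 1;
            while (*v == ' ' || *v == '\t')
                v++;
            return v;
        }
        line = strstr(line, "\r\n");
    }
    return NULL;
}

/* Decide how the body of a response is delimited. On
 * FRAMING_CONTENT_LENGTH, *content_length receives the declared length. */
enum body_framing classify_response(const char *head, int is_head_request,
                                    long *content_length)
{
    assert(head != NULL && content_length != NULL);

    /* The status code follows the first space of the status line. */
    int status = 0;
    const char *sp = strchr(head, ' ');
    if (sp != NULL)
        status = atoi(sp + 1);

    if (is_head_request || (status >= 100 && status < 200) ||
        status == 204 || status == 304)
        return FRAMING_NO_BODY;

    const char *te = find_header(head, "Transfer-Encoding");
    if (te != NULL && strncasecmp(te, "chunked", 7) == 0)
        return FRAMING_CHUNKED;      /* any Content-Length is ignored */

    const char *cl = find_header(head, "Content-Length");
    if (cl != NULL) {
        *content_length = strtol(cl, NULL, 10);
        return FRAMING_CONTENT_LENGTH;
    }
    return FRAMING_UNTIL_CLOSE;
}
```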

It's even worse for HTTP/2, but then HTTP/2 is worse in many ways.

In practice, interactive HTTP user agents (browsers) use a combination of 
methods and heuristics, because they have to deal with broken servers, broken 
code running under servers, broken intermediary nodes (gateways and proxies), 
network problems, etc. Thus they try to apply the rules for determining the end 
of the response, but they also try to render data as it's received, and after a 
while they'll time out and decide that a message has ended.

-- 
Michael Wojcik
Technology Specialist, Micro Focus



Re: [openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Matt Caswell
Hi Hanno

On 26/04/16 14:28, Hanno Böck wrote:
> Hi,
> 
> I have a problem here using OpenSSL, maybe I have some fundamental
> misunderstanding of how the api is supposed to be used.
> 
> What I want to do: Send a couple of HTTP requests over one connection
> (with HTTP/1.1, keep-alive enabled).
> Seems simple enough: I send a HTTP request and then read what the
> server sends, then send the next.
> 
> However: How do I know when the server has stopped sending?
> I have attached a code sample (it's missing lots of error checking in
> the initialization phase, but that's just for simplification of the
> code and shouldn't matter for now).

There are a few ways of doing this:

1) Track it at the application protocol layer. For example read the
"Content-Length" HTTP header and wait until you've received that amount
of data. This is probably the best way. The other ways below only tell
you whether the network *currently* has any data to provide to you - not
whether the server has finished sending.

2) Use non-blocking IO

3) Check the underlying file descriptor to see if it is readable at the
moment. For example, you can call BIO_get_fd() and then call "select",
"poll" or similar on the file descriptor to see if it is readable. You
may need to use SSL_pending()/SSL_has_pending() in combination with
this (see below).
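
A minimal sketch of option 3 for the descriptor part only: wait until the socket is readable or a timeout expires, using select(). The helper name wait_readable is hypothetical. With TLS, a readable socket means only that some bytes have arrived, not necessarily a whole record, and data that OpenSSL has already buffered will not show up here, which is why the SSL_pending() check is needed as well.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: wait up to timeout_ms for fd to become readable.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_readable(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;

    assert(fd >= 0 && fd < FD_SETSIZE);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec  = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int rc = select(fd + 1, &rfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return rc > 0 ? 1 : 0;
}
```

The descriptor passed in would be the one obtained from BIO_get_fd().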


> The relevant part is here:
>   for (i = 0; i < 5; i++) {
>   printf("calling BIO_write\n");
>   r = BIO_write(bio, request, strlen(request));
>   printf("%i bytes written\n", r);
>   do {
>   printf("calling BIO_read\n");
>   r = BIO_read(bio, buf, 1024);
>   printf("%i bytes read\n", r);
>   } while (r > 0);
>   }
> 
> Now when I run this code it sends one write and reads a couple of
> times. However when it's done BIO_read will block the program execution
> and not return until a timeout.
> 
> So I need a way to know that there's nothing to read before calling
> BIO_read. Searching the docs I thought SSL_pending() might be what I
> need. However it always returns zero, no matter if the server has
> something to send or not.

SSL_pending() only tells you whether OpenSSL has read data from the
network and processed it, but has not yet provided it to you. This might
happen, for example if OpenSSL received a record of application data but
you only read part of it in the last SSL_read call. Compare with
SSL_has_pending() (only in 1.1.0).


> 
> Another sidenote: I have set the timeout of the context to 2, but it
> still hangs for much longer, so the timeout value doesn't seem to have
> any effect.

SSL_CTX_set_timeout() sets the timeout of the *session*. It has nothing
to do with the connection.
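
If the goal is to limit how long a blocking read can wait, one option (an assumption on my part, not something proposed in this thread) is a receive timeout on the underlying socket. The sketch below uses the standard SO_RCVTIMEO socket option; the helper name set_recv_timeout is hypothetical, and how the timeout surfaces through BIO_read should be checked against the OpenSSL documentation.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/time.h>

/* Hypothetical helper: cap how long a blocking receive on the socket fd
 * may wait. Returns 0 on success, -1 on error (as setsockopt() does). */
int set_recv_timeout(int fd, int seconds)
{
    struct timeval tv;

    assert(seconds >= 0);
    tv.tv_sec  = seconds;
    tv.tv_usec = 0;
    return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}
```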

Matt



[openssl-users] BIO_read hangs, how can I know if the server wants to send data?

2016-04-26 Thread Hanno Böck
Hi,

I have a problem here using OpenSSL, maybe I have some fundamental
misunderstanding of how the api is supposed to be used.

What I want to do: Send a couple of HTTP requests over one connection
(with HTTP/1.1, keep-alive enabled).
Seems simple enough: I send a HTTP request and then read what the
server sends, then send the next.

However: How do I know when the server has stopped sending?
I have attached a code sample (it's missing lots of error checking in
the initialization phase, but that's just for simplification of the
code and shouldn't matter for now).

The relevant part is here:
	for (i = 0; i < 5; i++) {
		printf("calling BIO_write\n");
		r = BIO_write(bio, request, strlen(request));
		printf("%i bytes written\n", r);
		do {
			printf("calling BIO_read\n");
			r = BIO_read(bio, buf, 1024);
			printf("%i bytes read\n", r);
		} while (r > 0);
	}

Now when I run this code it sends one write and reads a couple of
times. However when it's done BIO_read will block the program execution
and not return until a timeout.

So I need a way to know that there's nothing to read before calling
BIO_read. Searching the docs I thought SSL_pending() might be what I
need. However it always returns zero, no matter if the server has
something to send or not.

Another sidenote: I have set the timeout of the context to 2, but it
still hangs for much longer, so the timeout value doesn't seem to have
any effect.

I also tried a number of other things, including using SSL_read/write,
BIO_puts/gets (I didn't really find any good explanation when to use
which of the three), using a nonblocking bio (but that was totally
confusing) etc.

Any help appreciated.

-- 
Hanno Böck
https://hboeck.de/

mail/jabber: ha...@hboeck.de
GPG: BBB51E42
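
#include <stdio.h>
#include <string.h>

#include <openssl/bio.h>
#include <openssl/ssl.h>

int main(void)
{
	SSL_CTX *ctx;
	BIO *bio;
	SSL *ssl;
	char buf[1024];		/* byte buffer for BIO_read */
	int r, i;
	const char *request = "GET / HTTP/1.1\r\nHost: x\r\n\r\n";

	SSL_library_init();
	SSL_load_error_strings();

	ctx = SSL_CTX_new(TLSv1_2_method());
	/* Session timeout only; does not limit how long BIO_read blocks. */
	SSL_CTX_set_timeout(ctx, 2);

	bio = BIO_new_ssl_connect(ctx);
	BIO_set_conn_hostname(bio, "google.com:443");
	BIO_get_ssl(bio, &ssl);

	if (BIO_do_connect(bio) <= 0) {
		fprintf(stderr, "connect failed\n");
		return 1;
	}

	for (i = 0; i < 5; i++) {
		printf("calling BIO_write\n");
		r = BIO_write(bio, request, (int)strlen(request));
		printf("%i bytes written\n", r);
		/* Blocks once the whole response has been read, because the
		 * server keeps the connection open (keep-alive). */
		do {
			printf("calling BIO_read\n");
			r = BIO_read(bio, buf, sizeof buf);
			printf("%i bytes read\n", r);
		} while (r > 0);
	}

	BIO_free_all(bio);
	SSL_CTX_free(ctx);
	return 0;
}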

