By the way, I'm unable to add the patch to the next commitfest due to the cool off period for new accounts. How long is that period? I don't suppose there's a way to avoid it?
On Mon, Feb 12, 2024 at 11:31 AM David Benjamin <david...@google.com> wrote: > On Mon, Feb 12, 2024 at 9:38 AM Daniel Gustafsson <dan...@yesql.se> wrote: > >> > On 11 Feb 2024, at 19:19, David Benjamin <david...@google.com> wrote: >> > It turns out the parts that came from the OpenSSL socket BIO were a >> no-op, and in fact PostgreSQL is relying on it being a no-op. Instead, it's >> cleaner to just define a custom BIO the normal way, which then leaves the >> more standard BIO_get_data mechanism usable. This also avoids the risk that >> a future OpenSSL will add a now BIO_ctrl to the socket type, with libssl >> calling into it, and then break some assumptions made by PostgreSQL. >> >> + case BIO_CTRL_FLUSH: >> + /* libssl expects all BIOs to support BIO_flush. >> */ >> + res = 1; >> + break; >> >> Will this always be true? Returning 1 implies that we have flushed all >> data on >> the socket, but what if we just before called BIO_set_retry..XX()? >> > > The real one is also just a no-op. :-) > https://github.com/openssl/openssl/blob/master/crypto/bio/bss_sock.c#L215 > > This is used in places like buffer BIO or the FILE* BIO, where BIO_write > might accept data, but stage it into a buffer, to be flushed later. libssl > ends up calling BIO_flush at the end of every flight, which in turn means > that BIOs used with libssl need to support it, even if to return true > because there's nothing to flush. (Arguably TCP sockets could have used a > flush concept, to help control Nagle's algorithm, but for better or worse, > that's a socket-wide TCP_NODELAY option, rather than an explicit flush > call.) > > BIO_set_retry.. behaves like POSIX I/O, where a failed EWOULDBLOCK write > is as if you never wrote to the socket at all and doesn't impact > socket state. That is, the data hasn't been accepted yet. It's not expected > for BIO_flush to care about the rejected write data. (Also I don't believe > libssl will ever trigger this case.) It's confusing because unlike an > EWOULDBLOCK errno, BIO_set_retry.. *is* itself BIO state, but that's just > because the BIO calling convention is goofy and didn't just return the > error out of the return value. So OpenSSL just stashes the bit on the BIO > itself, for you to query out immediately afterwards. > > >> > I've attached a patch which does that. The existing SSL tests pass with >> it, tested on Debian stable. (Though it took me a few iterations to figure >> out how to run the SSL tests, so it's possible I've missed something.) >> >> We've done a fair bit of work on making them easier to run, so I'm >> curious if >> you saw any room for improvements there as someone coming to them for the >> first >> time? >> > > A lot of my time was just trying to figure out how to run the tests in > the first place, so perhaps documentation? But I may just have been looking > in the wrong spot and honestly didn't really know what I was doing. I can > try to summarize what I did (from memory), and perhaps that can point to > possible improvements? > > - I looked in the repository for instructions on running the tests and > couldn't find any. At this point, I hadn't found src/test/README. > - I found > https://wiki.postgresql.org/wiki/Developer_FAQ#How_do_I_test_my_changes.3F, > linked from https://www.postgresql.org/developer/ > - I ran the configure build with --enable-cassert, ran make check, tests > passed. > - I wrote my patch and then spent a while intentionally adding bugs to see > if the tests would catch it (I wasn't sure whether there was ssl test > coverage), finally concluding that I wasn't running any ssl tests > - I looked some more and found src/test/ssl/README > - I reconfigured with --enable-tap-tests and ran make check > PG_TEST_EXTRA=ssl per those instructions, but the SSL tests still weren't > running > - I grepped for PG_TEST_EXTRA and found references in the CI config, but > using the meson build > - I installed meson, mimicked a few commands from the CI. That seemed to > work. > - I tried running only the ssl tests, looking up how you specify > individual tests in meson, to make my compile/test cycles a bit faster, but > they failed. > - I noticed that the first couple "tests" were named like setup tasks, and > guessed that the ssl tests depended on this setup to run. But by then I > just gave up and waited out the whole test suite per run. :-) > > Once I got it running, it was quite smooth. I just wasn't sure how to do > it. > > David >