> "Testing" db interaction in a faked, circumscribed > only-my-stuff-is-there world is folly. Certainly each db developer > needs their own instance of the database (on their own box or a > server). And it needs to be kept current with both DDL and domain meta > data changes (see things like flyway) as regularly as is the source > code. It should have a decent representation of a production dataset > else reads and writes will always be fast. All the tests reading and > writing all the columns of all the tables generates a lot of "green > lights" but near zero practicable information in the developers' workflow.
This is not the extent of the testing we plan on using. In this email
thread, I'm only referring to fast unit tests that ensure the correctness
of the application's logic and behavior. These are paired with the proper,
hygienic things you called out.

> > Were I a betting man, I would bet heavily against this community, which
> > prides itself on NOT losing data, allowing an option that would do just
> > that.
>
> Well, mumble ... we have any number of options that can be read that way. One
> obvious one is that we don't try to prevent you from putting $PGDATA on a RAM
> disk. Turning off fsync is another popular way to trade away durability for
> speed.

We already do both for unit tests, which helps a bunch :). (There's a rough
sketch of that setup at the bottom of this message.)

> But I concur with the point that we're not here to pretend to be an embedded
> database, as there are other projects that do that better (for example, our
> good friends at SQLite).
>
> The advice I'd give the OP is to take a look at our TAP-test infrastructure.
> We've put a good deal of effort, and are continuing to do so, into letting
> those tests spin up transitory testing databases pretty cheaply.

Thanks! I'll dig in. I'm guessing you're referring to these?
https://www.postgresql.org/docs/current/regress-tap.html

For what it's worth, I don't think Postgres _should_ be an embedded
database, but I do think there are some qualities of embedded databases I
would love to see in Postgres: namely, the self-contained cleanup.
Regardless, I worry that introducing SQLite into this discussion was a
mistake that hurt the coherency of my request, so I apologize for that.

> SQLite is fantastic, but it's type-system and lock-model are too restrictive,
> for a general DB. Similar to David, I think PostgreSQL is close to my ideal
> above, yet still far-enough (and perhaps unwilling enough, as a community)
> to venture into embedded and localhost use-cases, that it's frustrating.

Yup, Dominique understands what I want: Postgres' type system, query
planner, and locking model, but _shades_ of SQLite's operational properties
during local development. However, I don't really need Postgres to function
like an embedded database; I just want the self-contained process cleanup.

Connecting to Postgres over TCP on localhost is perfect for my use case:
the performance difference between an in-process database (à la SQLite) and
a connection over localhost is the difference between single-digit
microseconds and single-digit milliseconds. That difference matters in some
cases, but not here: as far as a human running tests is concerned, both are
instant. Here's someone at CrunchyData/Snowflake providing an experience
report of this exact workflow:
https://www.crunchydata.com/blog/dont-mock-the-database-data-fixtures-are-parallel-safe-and-plenty-fast

Anyways, I'll try to get at what motivated this whole discussion: would
there be community opposition to adding a CLI flag that would shut down all
Postgres processes once all open connections close? E.g., something similar
to SQL Server's `auto_close`, in the vein of `postgres -c "auto_close_after=100"`
or `pg_ctl start --exit-mode=smart`?

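Here's the rough sketch I mentioned of the kind of throwaway, fsync-off
instance our unit tests rely on. It's simplified and the helper name is
made up, but every flag shown is an existing initdb/postgres option
(assuming both binaries are on PATH):

    # Throwaway test instance: initdb into a temp directory (point it at a
    # RAM disk if you want the $PGDATA-on-tmpfs variant), start postgres
    # with durability traded away, and remove everything at exit.
    import atexit
    import shutil
    import subprocess
    import tempfile
    import time

    def start_throwaway_postgres(port: int = 54321) -> str:
        datadir = tempfile.mkdtemp(prefix="pgtest-")
        subprocess.run(["initdb", "-D", datadir, "-A", "trust", "--no-sync"],
                       check=True)
        server = subprocess.Popen([
            "postgres", "-D", datadir,
            "-c", "fsync=off",               # the durability-for-speed trade
            "-c", "synchronous_commit=off",
            "-c", f"port={port}",
            "-k", datadir,                   # keep the unix socket private too
        ])

        def cleanup() -> None:
            server.terminate()               # SIGTERM asks for a smart shutdown
            server.wait()
            shutil.rmtree(datadir, ignore_errors=True)

        atexit.register(cleanup)
        time.sleep(1)                        # crude; polling pg_isready is nicer
        return datadir
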
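And for what it's worth, the closest existing knob I've found to the
auto-close behavior I'm asking about is pg_ctl's "smart" shutdown mode,
which refuses new connections and waits for existing sessions to disconnect
before the postmaster exits. The difference is that someone still has to
issue the stop; a tiny (hypothetical) wrapper around it looks like:

    # Ask a running test instance to exit once every client has
    # disconnected, using pg_ctl's existing "smart" shutdown mode.
    import subprocess

    def stop_when_clients_disconnect(datadir: str) -> None:
        subprocess.run(["pg_ctl", "stop", "-D", datadir, "-m", "smart"],
                       check=True)
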
On Oct 13, 2025 at 9:43:15 AM, Dominique Devienne <[email protected]> wrote:
> On Mon, Oct 13, 2025 at 5:37 PM Tom Lane <[email protected]> wrote:
> > Dominique Devienne <[email protected]> writes:
> > > On Mon, Oct 13, 2025 at 5:08 PM Ron Johnson <[email protected]> wrote:
> > > > Shared memory means that I can stomp all over you, and you can't stop
> > > > me. That's the antithesis of ACID.
> > >
> > > SHM is how SQLite in WAL mode coordinates access to the same DB from
> > > several connections. So if it's good enough for SQLite, I don't see
> > > what it would be wrong for PostgreSQL too.
> >
> > SQLite has accepted the cost that comes with being embedded, which is
> > that application-side memory-stomping bugs can destroy the database.
> > Postgres is not willing to make that tradeoff. From a pure
> > developer's perspective, every time we got a bug report we'd have to
> > ask "did you observe this while running embedded?" and then demand a
> > repro that uses a non-embedded database. We are not going to help
> > application authors debug their own bugs, especially not when we have
> > no visibility into what those are.
>
> That's true for the embedded case, true.
>
> There are some of those, on the ML (custom SQLite-based forum in
> fact), but not that many in fact, far from it. So that concern does
> exist, but maybe not to the extent one fears.
>
> But not for the localhost case, which remains "client-server"
> (multi-process). And SHM is then one of the options for the
> "transport" between the libpq-based client, and the backends (running
> on localhost). Unix Socket on Linux is almost perfect for the
> localhost case, but again, is not portable. And I'd need simplified
> authN, on the fly start if necessary, that kind of thing. Our apps are
> multi-process themselves too, and each process can also be
> multi-connection. In the localhost case, the data is private to you,
> but can still be accessed concurrently across connections (from one or
> more processes). And in that case, we shouldn't have to deal with
> passwords, and everything should run as the OS user.
>
> > > SQLite is also ACID.
> >
> > I guess they have a different set of assumptions about what that
> > buzzword means.
>
> As you wrote, there are existing footguns one can turn on to weaken
> ACID already. PostgreSQL is superior to SQLite in many ways. I get
> that the embedded use-case is a step too far, for a long time, but the
> localhost case, for testing but also for localhost private-data
> serving (a possibly cache of a larger remote server) is much more
> attainable. And valuable IMHO.
