Re: how does postgresql handle LOB/CLOB/BLOB column data that dies before the query ends

2023-02-27 Thread Noel Grandin
On Sat, 25 Feb 2023 at 19:06, Tom Lane  wrote:

> That could be a piece of the puzzle, yeah.
>
>
Thank you very much, this conversion has been a great help.

Regards, Noel Grandin


Re: how does postgresql handle LOB/CLOB/BLOB column data that dies before the query ends

2023-02-25 Thread Noel Grandin
On Sat, 25 Feb 2023 at 08:33, Tom Lane  wrote:

> Yeah, Postgres has an analogous kind of problem.  Our standard way to
> use "large objects" is to store their identifying OIDs in tables,
>
...

> and in particular they can *not* close the transaction that read the
> OID if they'd like to read a matching state of the large object.
> So far there's not been a lot of complaints about that ...
>
>
OK, so it seems like so far my design is not far off the PostgreSQL design
(which is very comforting).

I wonder if the difference is in the client<->server protocol.

Does PostgreSQL hold the transaction open until the client side has closed
the resultset (or the query object possibly, not sure about the PostgreSQL
API here).
H2 has a very simple client-server protocol, which means the client simply
sends a query and gets back a result-set stream, and there is no explicit
acknowledgement of when the client closes the resultset, which means that
the MVCC transaction is typically closed by the time the client even starts
reading the resultset.


Re: how does postgresql handle LOB/CLOB/BLOB column data that dies before the query ends

2023-02-24 Thread Noel Grandin
Thanks for the answers.

So, H2, like PostgreSQL, also internally has (a) an MVCC engine and (b)
LOBs existing as a on-the-side extra thing.

On Fri, 24 Feb 2023 at 17:39, Tom Lane  wrote:

> Postgres doesn't really do LOB in the same sense that some other DBs
> have, so you'd need to specify what you have in mind in Postgres
> terms to get a useful answer.
>

So, specifically, the primary problem we have is this:

(1) A typical small query returns all of its data in a stream to the client
(2) which means that, from the server's perspective, the transaction is
closed the moment the last record in the stream is pushed to the client.
(3) which means that, in the face of concurrent updates, the underlying
MVCC data in the query might be long-gone from the server by the time the
client has finished reading the result set.
(4) However, with LOBs, the client doesn't get the LOB in the result set
data stream, it gets a special identifier (a hash), which it uses to fetch
LOB data from the server in chunks
(5) Which means that the lifetime of an individual LOB is just horrible
At the moment the implementation I have satisfies the needs of clients in
terms of correctness (crosses fingers), but is horrible in terms of
performance because of how long it has to keep LOB data around.


how does postgresql handle LOB/CLOB/BLOB column data that dies before the query ends

2023-02-24 Thread Noel Grandin
Hi

Hacker from another open-source DB here (h2database.com).

How does postgresql handle the following situation?

(1) a table containing a LOB column
(2) a query that does
   ResultSet rs = query("select lob_column from table_foo");
  while (rs.next())
  {
  retrieve_lob_data(rs.getLob(1));
   very long running stuff here..
   }

In the face of concurrent updates that might overwrite the existing LOB
data, how does PostgresQL handle this?

Does it keep the LOB data around until the ResultSet/Connection is closed?
Or does it impose some extra constraint on the client side? e.g..
explicitly opening and closing a transaction, and only wipe the "old" LOB
data when the transaction is closed?

I ask because I have implemented two of the four LOB implementations that
H2 has used, and we are still having trouble :-(

Regards, Noel.