(continuing an old thread from https://www.postgresql.org/message-id/CAMsr%2BYEtamQYZ5EocsuthQCvyvmRnQrucDP6GZynPtf0gsMbuw%40mail.gmail.com )
Craig Ringer <craig(at)2ndquadrant(dot)com>, 2018-02-26: > On 26 February 2018 at 04:05, Anthony Communier <anthony(dot)communier(at)gmail(dot)com> wrote: > > > It would be nice if application connected to a Postrgesql database could > > send and receive JSONB in binary. It could save some useless text > > conversion. All works could be done on application side which are often > > more scalable than database itself. > > To support this, you'd need to extract PostgreSQL's jsonb support into a C > library that could be used independently of backend server infrastructure > like 'palloc' and memory contexts, ereport(), etc. Or write a parallel > implementation. At Shift, we also have use cases that would likely be sped up quite a bit if we could avoid the conversion from JSONB to JSON, and instead pass binary JSONB to the application side and parse it there (in Go). I doubt we'd want to reuse any of Postgres's C code, and would instead go with your "parallel implementation" idea; I can't imagine it being particularly difficult to implement a JSONB parser from scratch. All we need, I think, is a Postgres function raw_jsonb(jsonb) that returns bytea but is the identity function at the byte level. (Or allow a cast from jsonb to bytea.) Our Go code would then send queries like SELECT col1, col2, raw_jsonb(col3) FROM table1 WHERE ...; I haven't thought in depth about how we'd parse the JSONB in Go, but perhaps we can synthesize a stream of JSON tokens from the binary JSONB (one token at a time, to avoid copies and allocations) and adapt the streaming parser https://github.com/json-iterator/go to turn it into Go values. Sending raw JSONB to Postgres might also be interesting, but I'd start with receiving. Would implementing raw_jsonb be as trivial as it sounds? What about usages like SELECT raw_jsonb(col3->'foo'); does the subobject returned by '->' share structure with the containing object, making the conversion to a self-contained JSONB value less direct? Can these conversions be implemented without copying the bytes? An open question about the API contract would be how raw_jsonb would be affected if Postgres introduces a version 2 of JSONB encoding. My intuition is to punt that problem to the application, and define that raw_jsonb returns whatever version of JSONB is most convenient for Postgres for that particular datum; this minimizes conversion work on the Postgres side, which is the purpose of the mechanism. Applications that want a stable format can use the conventional textual JSON format. But I could see a case for making the function raw_jsonb(int, jsonb) and allowing the caller to specify what (maximum?) version of JSONB they want.