2016-06-09 15:31 GMT+02:00 Merlin Moncure <mmonc...@gmail.com>:
> On Wed, Jun 8, 2016 at 1:04 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
> > Michael Paquier <michael.paqu...@gmail.com> writes:
> >> On Tue, Jun 7, 2016 at 10:03 PM, Josh Berkus <j...@agliodbs.com> wrote:
> >>> On 06/07/2016 08:42 AM, Nicolas Paris wrote:
> >>>> Will this 1GO restriction is supposed to increase in a near future ?
> >
> >>> Not planned, no. Thing is, that's the limit for a field in general, not
> >>> just JSON; changing it would be a fairly large patch. It's desirable,
> >>> but AFAIK nobody is working on it.
> >
> >> And there are other things to consider on top of that, like the
> >> maximum allocation size for palloc, the maximum query string size,
> >> COPY, etc. This is no small project, and the potential side-effects
> >> should not be underestimated.
> >
> > It's also fair to doubt that client-side code would "just work" with
> > no functionality or performance problems for such large values.
> >
> > I await with interest the OP's results on other JSON processors that
> > have no issues with GB-sized JSON strings.
>
> Yup. Most json libraries and tools are going to be disgusting memory
> hogs or have exponential behaviors, especially when you consider you
> are doing the transformation as well. Just prettifying json documents
> over 1GB can be a real challenge.
>
> Fortunately the workaround here is pretty easy. Keep your query
> exactly as is but remove the final aggregation step so that it returns
> a set. Next, make a small application that runs this query and does
> the array bits around each row (basically prepending the final result
> with [ appending the final result with ] and putting , between rows).
>
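A minimal sketch of the "array bits" step described above, in Python (the function name and sample rows are illustrative, not from the thread): each row already arrives as a complete JSON document in text form, so it only needs to be written out verbatim, with [ and ] around the whole set and , between rows.

    from io import StringIO
    from typing import Iterable, TextIO

    def write_json_array(rows: Iterable[str], out: TextIO) -> None:
        """Wrap already-serialized JSON rows into a single JSON array.

        Each element of `rows` is the text of one complete JSON value
        (e.g. the output of row_to_json()).  It is written verbatim --
        never quoted or escaped a second time -- so the result stays
        valid JSON.
        """
        out.write("[")
        for i, doc in enumerate(rows):
            if i:
                out.write(",")
            out.write(doc)
        out.write("]")

    # tiny self-test with two made-up rows
    buf = StringIO()
    write_json_array(['{"hello":"world"}', '{"foo":"bar"}'], buf)
    print(buf.getvalue())   # [{"hello":"world"},{"foo":"bar"}]

How the rows get fetched without exhausting client memory is the subject of the rest of the quoted mail below.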
The problem is that prepending/appending means dealing with strings.
Transforming each value of the result set into a string implies escaping
the double quotes, so:

row1 contains {"hello":"world"}
step 1 = prepend -> "[{\"hello\":\"world\"}"
step 2 = append  -> "[{\"hello\":\"world\"},"

and so on: the resulting JSON is corrupted. Hopefully I am simply going
about this the wrong way.

> It's essential that you use a client library that does not buffer the
> entire result in memory before emitting results. This can be done in
> psql (FETCH mode), java, libpq (single row mode), etc. I suspect
> node.js pg module can do this as well, and there certainly will be
> others.
>
> The basic objective is you want the rows to be streamed out of the
> database without being buffered. If you do that, you should be able
> to stream arbitrarily large datasets out of the database to a json
> document assuming the server can produce the query.
>
> merlin
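A minimal sketch of that streaming approach, using psycopg2's named (server-side) cursor so rows are fetched in batches instead of the whole result being buffered client-side; the connection string, table, column, and file names are placeholders:

    import psycopg2

    conn = psycopg2.connect("dbname=mydb")     # placeholder DSN
    cur = conn.cursor(name="json_stream")      # named => server-side cursor
    cur.itersize = 2000                        # rows fetched per round trip
    # ::text keeps each value as raw JSON text instead of a parsed object
    cur.execute("SELECT row_to_json(t)::text FROM my_table t")

    with open("out.json", "w") as f:
        f.write("[")
        for i, (doc,) in enumerate(cur):
            if i:
                f.write(",")
            f.write(doc)                       # raw JSON text, row by row
        f.write("]")

    cur.close()
    conn.close()

psql can get a similar effect by setting FETCH_COUNT, which makes it read the result through a cursor rather than loading it all at once.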