On Wed, May 5, 2021, at 20:45, Tom Lane wrote:
> "Joel Jacobson" <j...@compiler.org> writes:
> > I think you misunderstood the problem.
> > I don't want the entire file to be considered a single value.
> > I want each line to become its own row, just a row with a single column.
> >
> > So I actually think COPY seems like a perfect match for the job,
> > since it does precisely that, except there is no delimiter in this case.
>
> Well, there's more to it than just the column delimiter.
>
> * What about \N being converted to NULL?
> * What about \. being treated as EOF?
> * Do you want to turn off the special behavior of backslash (ESCAPE)
>   altogether?
> * What about newline conversions (\r\n being seen as just \n, etc)?
>
> I'm inclined to think that "use pg_read_file and then split at newlines"
> might be a saner answer than delving into all these fine points.
> Not least because people yell when you add cycles to the COPY
> inner loops.

Thanks for providing strong arguments why the COPY approach is a dead end; I agree.

However, as demonstrated in my previous email, using

    string_to_table(pg_read_file( filename ), E'\n')

has both performance and maximum-size issues.

Maybe these two problems could be solved by combining the two functions into one?

    file_to_table ( filename text, delimiter text [, null_string text ] ) → setof text

I'm thinking that, thanks to returning "setof text", such a function could
read a stream and return a line as soon as a delimiter is encountered,
without having to keep the entire file in memory at any time.

/Joel
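
To make the streaming idea concrete, here is a rough Python sketch of the semantics I have in mind. It is only an illustration, not a proposed implementation: the real function would be written in C inside the server, and the chunk_size parameter, the byte-oriented interface, and the decision not to emit an empty final row after a trailing delimiter are all assumptions of this sketch.

```python
import io
from typing import BinaryIO, Iterator, Optional


def file_to_table(stream: BinaryIO,
                  delimiter: bytes = b"\n",
                  null_string: Optional[bytes] = None,
                  chunk_size: int = 8192) -> Iterator[Optional[bytes]]:
    """Yield one field per delimiter-terminated piece of the stream.

    Hypothetical model of the proposed file_to_table(); only the
    unfinished tail of the input is held in memory at any time.
    """
    if not delimiter:
        raise ValueError("delimiter must not be empty")
    buf = b""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buf += chunk
        # Everything before the last delimiter is complete; the remainder
        # (possibly containing half a delimiter) waits for the next chunk.
        *complete, buf = buf.split(delimiter)
        for field in complete:
            yield None if field == null_string else field
    # Assumption: a final field without a trailing delimiter is still a row,
    # but a trailing delimiter does not produce an extra empty row.
    if buf:
        yield None if buf == null_string else buf


if __name__ == "__main__":
    demo = io.BytesIO(b"first\nsecond\n\\N\nlast")
    for row in file_to_table(demo, b"\n", null_string=b"\\N", chunk_size=4):
        print(row)
```

Because rows are produced as soon as their delimiter is seen, the memory needed is bounded by the longest single field plus one chunk, rather than by the size of the whole file.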