Thank you for the comments.

On Tue, Feb 3, 2026 at 11:12 PM David G. Johnston
<[email protected]> wrote:
>> Syntax example:
>> - COPY t FROM STDIN (LIMIT 100);
>>
>> This feature is useful for:
>> - Loading only the first N rows from a huge CSV file to verify data or
>> table definitions before a full import
>
> Would want it paired with offset for this use case.

COPY FROM already accepts HEADER <integer> to skip N leading lines, so
you can combine it with LIMIT to get OFFSET + LIMIT semantics today:

=# COPY t FROM 'file.csv' (HEADER 100, LIMIT 50);

>> Design:
>> - The LIMIT count applies after WHERE filtering and ON_ERROR skipping,
>> so it represents the actual number of rows inserted.
>
> Not sure about that choice. I'd go with pre-eval or implement both and
> default to pre-eval.

Counting after filtering is consistent with SQL semantics: SELECT ...
WHERE ... LIMIT N counts rows that pass the filter, not rows scanned.
Pre-eval behavior is already achievable externally (head -n), while
post-eval can only be done server-side, which makes it the more valuable
choice for a built-in option.
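To make the post-eval choice concrete, here is a sketch of the intended
interaction with WHERE (val is an illustrative column, and LIMIT is the
option proposed here, so this does not run against current core):

-- file.csv has many input lines; only some rows satisfy val > 0.
=# COPY t FROM 'file.csv' WITH (FORMAT csv, LIMIT 10) WHERE val > 0;
-- Post-eval: exactly 10 rows passing val > 0 are inserted, scanning as
-- many input lines as needed, just like
--   SELECT ... WHERE val > 0 LIMIT 10;
-- Pre-eval would instead stop reading after 10 input lines and could
-- insert fewer than 10 rows.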
On Tue, Feb 3, 2026 at 11:41 PM Tom Lane <[email protected]> wrote:
>
> Shinya Kato <[email protected]> writes:
> > I'd like to propose adding a LIMIT option to COPY FROM, which limits
> > the number of rows to load.
>
> Do we really need this? Each random feature we load onto COPY
> slows it down for everybody.

When LIMIT is not specified (the default), the only overhead is a single
if (limit > 0) branch per row on a struct field already in L1 cache,
which is the same class of cost as the existing ON_ERROR and WHERE
checks.

This is not a novel feature either. Oracle SQL*Loader provides LOAD, and
SQL Server BULK INSERT provides LASTROW. The absence of LIMIT in COPY
FROM is arguably a gap relative to other RDBMSes.

On Wed, Feb 4, 2026 at 12:07 AM Florents Tselai
<[email protected]> wrote:
> I work with such scenarios a lot and I can't see why COPY should worry
> itself about such filtering.
> IRL most of what goes into COPY's STDIN has already been filtered
> extensively, like ... | head -n 100 | COPY t FROM STDIN

head -n works for the STDIN-pipe workflow, but not for all COPY FROM
scenarios (a short contrast sketch follows this list):

- Server-side files via remote psql: COPY t FROM '/server/path/file.csv'
  is read directly by the server process. A client connected over the
  network has no way to interpose a pipe on that I/O path.
- Interaction with WHERE / ON_ERROR: head -n 100 limits input lines, but
  cannot guarantee a specific number of inserted rows when some rows are
  filtered by WHERE or skipped by ON_ERROR. That control is only
  possible server-side.
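Concretely (file paths and the table name are illustrative, and LIMIT is
the proposed option, so the command below shows intended behavior):

-- A client-side pipe can trim what reaches STDIN:
--   $ head -n 100 data.csv | psql -c "COPY t FROM STDIN WITH (FORMAT csv)"
-- But a server-side file is opened by the backend itself, so no
-- client-side pipe can intervene; only a server-side option can trim it:
=# COPY t FROM '/server/path/file.csv' WITH (FORMAT csv, LIMIT 100);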
The same "do it outside" argument could be made against the WHERE clause
("just use grep"), yet WHERE was accepted because server-side filtering
provides value that external tools cannot fully replicate. LIMIT fills
the same kind of gap.

--
Best regards,
Shinya Kato
NTT OSS Center