On 4/11/23 14:37, Federico wrote:
The problem here is not having the auto increment id in a particular
order, is that there
is apparently no correlation with the position of an element in the
values clause with the
id generated. That's the reason for using the sentinel column in the
general solution in the previous message.
The extend on the use case, SQLAlchemy has 3 objects T that have
T(data='a'), T(data='b'), T(data='c') but no
value for the id column. The objective is to insert the 3 data values,
get back the ids and correctly match them with
the correct 3 objects.
No. Sadly, adding that ORDER BY is just voodoo programming, because
it applies to the result of the SELECT while promising nothing about
the order in which INSERT/RETURNING will act on those rows.
I wasn't probably clear, it's fine if INSERT/RETURNING order is
arbitrary, what matters is that the
autoincementing values is executed in the same order as select, like
mentioned in this
previous message
https://www.postgresql.org/message-id/29386.1528813619%40sss.pgh.pa.us
Is that not the case?
Re-reading that 2012 thread, the main new observation I'd make today
is that parallel operation is a thing now, and it's not hard to foresee
that sometime soon we'll want to parallelize INSERTs. Which'd make it
*really* hard to promise anything about the order of RETURNING output.
I think it's fine not promising anything about the order of RETURNING, but
it would be very helpful having a way of tracking what input row
generated a particular
output row. Basically the sentinel case in the original post,
without actually having to insert the sentinel into the table.
I think if you want to use RETURNING with multi-row inserts, the
thing to do is more like
INSERT INTO t(data) VALUES ('a'), ('b'), ('c') RETURNING data, id
and then explicitly match up the returned "data" values rather than
presuming they appear in the same order you wrote them in in VALUES.
Admittedly this might be problematic if some of the VALUES rows
are identical, but how much should you care?
Well, the example is very easy, but it's hard to generalize when
inserting multiple columns
with possible complex values in them, since it would mean matching on
possibly large json values,
arrays, etc. So definitely not ideal
Thanks,
Federico
Can your client retain a hashmap of md5,data pairings, allowing the
lookup on the way back using the returned data and supplied id?