On 5/26/21 12:50 PM, Rob Sargent wrote:
I have what purports to be Postgres 12 ( PostgreSQL 12.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-11), 64-bit ) running on RDS.  I testing a new release of my service with an admittedly large data set (which may be my swamping AWS bandwidth).  But my app logs tell me a table was created and loaded (and later, read)

From my tomcat log

    00:09:58.567 [https-jsse-nio-10.0.2.28-15002-exec-3] DEBUG
    edu.utah.camplab.jx.PayloadFromMux - STAGING TABLE CREATED:
    bulk."rjs_GEV15_15_ff3ba73c_d927_431c_bd29_9687a47f1594" 00:09:58.585
    [https-jsse-nio-10.0.2.28-15002-exec-3] INFO
    edu.utah.camplab.jx.PayloadFromMux -
    ff3ba73c-d927-431c-bd29-9687a47f1594: started COPY work at
    1621987798585 00:09:58.586 [ff3ba73c-d927-431c-bd29-9687a47f1594] INFO
    edu.utah.camplab.jx.PayloadWriterThread -
    bulk."rjs_GEV15_15_ff3ba73c_d927_431c_bd29_9687a47f1594": Begin bulk
    copy segment 00:10:01.597 [https-jsse-nio-10.0.2.28-15002-exec-11]
    ERROR edu.utah.camplab.servlet.PayloadSaveServlet - received payload
    00:10:01.598 [https-jsse-nio-10.0.2.28-15002-exec-11] DEBUG
    org.jooq.impl.DefaultConnectionProvider - setting auto commit : false
    00:10:01.599 [https-jsse-nio-10.0.2.28-15002-exec-11] DEBUG
    edu.utah.camplab.jx.PayloadFromMux -
    074d449b-c3ba-499f-83e3-f48427fe0156: start transaction at
    1621987801599 00:10:01.599 [https-jsse-nio-10.0.2.28-15002-exec-11]
    DEBUG org.jooq.impl.DefaultConnectionProvider - set savepoint
    00:10:07.511 [ff3ba73c-d927-431c-bd29-9687a47f1594] INFO
    edu.utah.camplab.jx.PayloadWriterThread - bulk transfer of 2528447
    took 8.925s 00:10:07.511 [ff3ba73c-d927-431c-bd29-9687a47f1594] DEBUG
    edu.utah.camplab.jx.PayloadWriterThread - staged in 8925 ms
    00:10:07.567 [https-jsse-nio-10.0.2.28-15002-exec-3] INFO
    edu.utah.camplab.jx.PayloadFromMux -
    ff3ba73c-d927-431c-bd29-9687a47f1594: Total segment save took 9486 ms
    00:10:07.567 [https-jsse-nio-10.0.2.28-15002-exec-3] INFO
    edu.utah.camplab.jx.AbstractPayload -
    ff3ba73c-d927-431c-bd29-9687a47f1594: closing process
    ff3ba73c-d927-431c-bd29-9687a47f1594 00:10:07.608
    [https-jsse-nio-10.0.2.28-15002-exec-3] DEBUG
    org.jooq.impl.DefaultConnectionProvider - release savepoint
    00:10:07.609 [https-jsse-nio-10.0.2.28-15002-exec-3] DEBUG
    edu.utah.camplab.jx.PayloadFromMux -
ff3ba73c-d927-431c-bd29-9687a47f1594: end transaction at 1621987807609 Which claims to have written 2,528,447 records in roughly 9 seconds into the newly created table "bulk.rjs.GEV15_15_FF3ba73c_d927_431c_bd29_9687147f1594". Nice.

However, no such table exists, though later processing renames it by appending "_done" to the name (being careful to remain under 64 char)

My middleware does receive an exception notice

    00:10:55.101 [https-jsse-nio-10.0.2.28-15002-exec-3] ERROR
    edu.utah.camplab.jx.AbstractPayload - run
    ff3ba73c-d927-431c-bd29-9687a47f1594: Exception from db write: SQL
    [insert into segment select * from
    bulk."rjs_GEV15_15_ff3ba73c_d927_431c_bd29_9687a47f1594" as s where
    s.probandset_id >= 'a0000000-0000-0000-0000-000000000000' and
    s.probandset_id < 'b0000000-0000-0000-0000-000000000000' ]; An I/O
error occurred while sending to the backend.: {}

which confirms(?) that the table /was/ there and read from 10 times prior since I copy from this temp, bulk loaded table to the actual target in 16 chunks by diddling the first hex digit from 0 through f.  Here zero through nine apparently didn't have a problem.  These 16 inserts are in a single transaction, separate from the bulk copy.  (There are a dozen more  of these files processed and disappeared.)

My question is:
Should I be surprised that the initial bulk loaded table is nowhere to be seen, given the thumbs up from the logs?  Is this frailty inherent in AWS/RDS infrastructure?

Since this is an academic exorcise, I have minimal AWS support, which has yet to chime in on this matter. My access to the logs is um, er, constrained.

The big differences I notice are:

1. "postgres" is not a superuser,
2. viewing logs is a hassle.

Otherwise, they're really similar.  We've pumped about 6TB of data into an instance, and it's been rock solid.  JBoss is quite happy, and there haven't been any problems.

--
Angular momentum makes the world go 'round.

Reply via email to