Jim,

Thanks for the suggestions; they're excellent.
Partly, I'm using this current task as an opportunity to dig into the subject of bulk imports in Postgres. So far, I've figured out:

* Prepared statements don't deliver any real speed gain.

* Bundling operations into transactions with explicit BEGIN; and COMMIT; statements helps a lot. Shipping versions of Postgres invisibly wrap each statement in its own transaction, which makes lots of individual inserts a bit expensive.

* DROP INDEX before import and CREATE INDEX after import is a real time-saver, and it's not hard to automate. I've decided to leave unique indexes in place as, well, they're important, and otherwise you have to drop table constraints. And as the docs say, "think twice."

I had to spend a bit of time working out the full UPSERT (INSERT ... ON CONFLICT) syntax in Postgres. At first my reaction was, eh? I was used to MySQL, where it's pretty easy. In Postgres? A lot more complicated-looking. Once I understood the reasoning, I was pretty happy with what Postgres offers in this respect. You can, for example, do something like this:

-- Add this row if it doesn't exist
-- If it does exist, update it with the new data,
-- unless the new data is the same as the old data

It's pretty easy to do, particularly with some advance planning. (I'm using a checksum on the values to distinguish between changed and unchanged rows.)

Next up was to retest multi-value inserts and COPY...and then look at files, but you guys are getting me excited about files now. There's likely to be a lot of importing on a regular basis, so it seems fruitful to check out the details now.

Ultimately, Postgres is just another piece of infrastructure/plumbing that I want to get out of the way. But given the potential scale of the data, ignorant design choices could be pretty painful to redo.

For the record: 4D is for data grinding and prep (raw transactional/relational data).
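To make the transaction-batching point concrete, here's a minimal sketch (the table and columns are made up for illustration). Without the explicit BEGIN/COMMIT, each INSERT pays its own commit cost:

```sql
-- Wrap the whole batch in one transaction so the commit
-- overhead is paid once instead of once per row.
BEGIN;
INSERT INTO staging_sales (id, amount) VALUES (1, 10.00);
INSERT INTO staging_sales (id, amount) VALUES (2, 12.50);
-- ... thousands more inserts ...
COMMIT;
```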
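The drop-then-recreate index dance mentioned above looks roughly like this (index and table names are hypothetical; note that unique indexes are left alone, per the earlier point about constraints):

```sql
-- Drop the non-unique index before the bulk load...
DROP INDEX IF EXISTS staging_sales_amount_idx;

-- ... run the bulk import here ...

-- ...then rebuild it once, which is cheaper than
-- maintaining it row by row during the load.
CREATE INDEX staging_sales_amount_idx ON staging_sales (amount);
```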
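The add-or-update-unless-unchanged pattern described above can be written in Postgres as an INSERT ... ON CONFLICT with a WHERE clause on the DO UPDATE. This is a sketch with a hypothetical table and a stored checksum column; IS DISTINCT FROM is used so NULL checksums compare sanely:

```sql
-- Insert the row; on a key conflict, update it,
-- but only when the incoming checksum differs
-- from the one already stored.
INSERT INTO staging_sales AS t (id, amount, row_checksum)
VALUES (1, 10.00, md5('1|10.00'))
ON CONFLICT (id) DO UPDATE
SET amount       = EXCLUDED.amount,
    row_checksum = EXCLUDED.row_checksum
WHERE t.row_checksum IS DISTINCT FROM EXCLUDED.row_checksum;
```

Rows whose data hasn't changed are skipped entirely, which avoids pointless dead tuples and WAL traffic on re-imports.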
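For the multi-value-insert and COPY retest mentioned above, the two shapes under comparison look like this (table, columns, and file path are illustrative only):

```sql
-- Multi-row VALUES: one statement, many rows.
INSERT INTO staging_sales (id, amount)
VALUES (1, 10.00), (2, 12.50), (3, 7.25);

-- COPY: usually the fastest bulk-load path.
-- Server-side COPY reads a file on the database host;
-- psql's \copy streams the same data from the client instead.
COPY staging_sales (id, amount)
FROM '/path/to/sales.csv' WITH (FORMAT csv, HEADER true);
```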
Postgres is serving as a data warehouse with summary data and extracts. Analytics will be done by something else. So, it's a data pipeline with 4D at the first step (importing, cleaning, and collating source data) and Postgres at the next step (storing and exposing reporting/analysis data).

Loving the great answers, much appreciated!

**********************************************************************
4D Internet Users Group (4D iNUG)
FAQ: http://lists.4d.com/faqnug.html
Archive: http://lists.4d.com/archives.html
Options: http://lists.4d.com/mailman/options/4d_tech
Unsub: mailto:4d_tech-unsubscr...@lists.4d.com
**********************************************************************