On Mon, Oct 12, 2015 at 11:17 AM, Shaun Thomas <bonesmo...@gmail.com> wrote:
> Hi guys,
>
> I've been doing some design investigation and ran into an interesting snag
> I didn't expect to find on 9.4 (and earlier). I wrote a quick Python script
> to fork multiple simultaneous COPY commands into several separate tables and
> found that performance apparently degrades based on how many COPY commands
> are running.
>
> For instance, in the logs with one COPY, I see about one second to import
> 100k rows. At two processes, it's 2 seconds. At four processes, it's 4
> seconds. This is for each process, so loading 400k rows takes 16 seconds
> cumulatively. To me, it looked like some kind of locking issue, but
> pg_locks showed no waits during the load. In trying to figure this out, I
> ran across this discussion:
>
> http://www.postgresql.org/message-id/cab7npqqjeasxdr0rt9cjiaf9onfjojstyk18iw+oxi-obo4...@mail.gmail.com
>
> Which came after this:
>
> http://forums.enterprisedb.com/posts/list/4048.page
>
> It would appear I'm running into whatever issue the xloginsert_slots patch
> tried to address, but not much discussion exists afterwards. It's as if the
> patch just vanished into the ether, even though it (apparently) massively
> improves PG's ability to scale data import.
>
> I should note that setting wal_level to minimal, or doing the load on
> unlogged tables, completely resolves this issue. However, those are not
> acceptable settings in a production environment. Is there any other way to
> get normal parallel COPY performance, or is that just currently impossible?
>
> I also know 9.5 underwent a lot of locking improvements, so it might not
> be relevant. I just haven't had a chance to repeat my tests on 9.5 yet.

Can you provide the test script? Also, have you tuned your database for
high I/O throughput? What is your storage system like?
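For reference, a harness of the kind described might be sketched like this, using only the Python standard library. This is not the original script (which wasn't posted); `load_one_table` is a hypothetical placeholder where the real COPY would be issued against its own connection, so the timing scaffold itself is runnable as-is:

```python
# Sketch of a parallel-load timing harness. Assumption: the real worker
# would run COPY via psql or a driver such as psycopg2; here the load is
# a placeholder sleep so the harness can run without a database.
import time
from multiprocessing import Pool


def load_one_table(table_name):
    """Stand-in worker. A real version would open a connection and run
    something like COPY <table_name> FROM '/path/to/data.csv' (CSV).
    Returns (table_name, elapsed_seconds)."""
    start = time.time()
    time.sleep(0.1)  # placeholder for the actual COPY
    return table_name, time.time() - start


def run_parallel_load(tables, workers):
    """Fork one process per table (up to `workers`) and time each load.
    In the behavior reported above, per-process time grows roughly
    linearly with the number of concurrent COPY commands."""
    with Pool(processes=workers) as pool:
        return dict(pool.map(load_one_table, tables))


if __name__ == "__main__":
    timings = run_parallel_load(["t1", "t2", "t3", "t4"], workers=4)
    for table, secs in sorted(timings.items()):
        print(f"{table}: {secs:.2f}s")
```

With a real COPY in the worker, comparing `workers=1` against `workers=4` on the same total row count would reproduce (or rule out) the per-process slowdown described.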