Are the inserts being done through one connection or multiple connections concurrently?
Sent from my iPhone > On Dec 24, 2017, at 2:51 PM, Jean Baro <jfb...@gmail.com> wrote: > > Hi there, > > We are testing a new application to try to find performance issues. > > AWS RDS m4.large 500GB storage (SSD) > > One table only, called Messages: > > Uuid > Country (ISO) > Role (Text) > User id (Text) > GroupId (integer) > Channel (text) > Title (Text) > Payload (JSON, up to 20kb) > Starts_in (UTC) > Expires_in (UTC) > Seen (boolean) > Deleted (boolean) > LastUpdate (UTC) > Created_by (UTC) > Created_in (UTC) > > Indexes: > > UUID (PK) > UserID + Country (main index) > LastUpdate > GroupID > > > We inserted 160MM rows, around 2KB each. No partitioning. > > Insert started at around 3.000 inserts per second, but (as expected) started > to slow down as the number of rows increased. In the end we got around 500 > inserts per second. > > Queries by Userd_ID + Country took less than 2 seconds, but while the batch > insert was running the queries took over 20 seconds!!! > > We had 20 Lambda getting messages from SQS and bulk inserting them into > Postgresql. > > The insert performance is important, but we would slow it down if needed in > order to ensure a more flat query performance. (Below 2 seconds). Each query > (userId + country) returns around 100 diferent messages, which are filtered > and order by the synchronous Lambda function. So we don't do any special > filtering, sorting, ordering or full text search in Postgres. In some ways we > use it more like a glorified file system. :) > > We are going to limit the number of lambda workers to 1 or 2, and then run > some queries concurrently to see if the query performance is not affect too > much. We aim to get at least 50 queries per second (returning 100 messages > each) under 2 seconds, even when there is millions of messages on SQS being > inserted into PG. > > We haven't done any performance tuning in the DB. > > With all that said, the question is: > > What can be done to ensure good query performance (UserID+ country) even when > the bulk insert is running (low priority). > > We are limited to use AWS RDS at the moment. > > Cheers > >