Are the inserts being done through one connection or multiple connections 
concurrently?

> On Dec 24, 2017, at 2:51 PM, Jean Baro <jfb...@gmail.com> wrote:
> 
> Hi there,
> 
> We are testing a new application to try to find performance issues.
> 
> AWS RDS m4.large 500GB storage (SSD)
> 
> One table only, called Messages:
> 
> Uuid
> Country  (ISO)
> Role (Text)
> User id  (Text)
> GroupId (integer)
> Channel (text)
> Title (Text)
> Payload (JSON, up to 20kb)
> Starts_in (UTC)
> Expires_in (UTC)
> Seen (boolean)
> Deleted (boolean)
> LastUpdate (UTC)
> Created_by (UTC)
> Created_in (UTC)
> 
> Indexes:
> 
> UUID (PK)
> UserID + Country (main index)
> LastUpdate 
> GroupID 
> 
> 
> We inserted 160MM rows, around 2KB each. No partitioning.
> 
> Inserts started at around 3,000 per second, but (as expected) started 
> to slow down as the number of rows increased. In the end we got around 500 
> inserts per second.
> 
> Queries by UserID + Country took less than 2 seconds, but while the batch 
> insert was running the queries took over 20 seconds!!!
> 
> We had 20 Lambdas getting messages from SQS and bulk inserting them into 
> PostgreSQL.
> 
> The insert performance is important, but we would slow it down if needed in 
> order to ensure flatter query performance (below 2 seconds). Each query 
> (UserID + Country) returns around 100 different messages, which are filtered 
> and ordered by the synchronous Lambda function. So we don't do any special 
> filtering, sorting, ordering or full text search in Postgres. In some ways we 
> use it more like a glorified file system. :)
> 
> We are going to limit the number of Lambda workers to 1 or 2, and then run 
> some queries concurrently to see whether the query performance is affected too 
> much. We aim to get at least 50 queries per second (returning 100 messages 
> each) under 2 seconds, even when there are millions of messages on SQS being 
> inserted into PG.
> 
> We haven't done any performance tuning in the DB. 
> 
> With all that said, the question is:
> 
> What can be done to ensure good query performance (UserID + Country) even when 
> the bulk insert (low priority) is running?
> 
> We are limited to using AWS RDS at the moment.
> 
> Cheers
> 
> 

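For a rough sense of scale, the figures in the quoted message can be turned into a quick estimate. This is only a back-of-envelope sketch: it assumes "2KB" means 2,000 bytes per row, ignores index and per-row overhead, and treats the two reported insert rates as constant, which they were not. The function names are made up for illustration.

```python
# Back-of-envelope figures taken from the quoted message; not measurements.
ROWS = 160_000_000   # "160MM rows"
ROW_BYTES = 2_000    # "around 2KB each" (assumed 2,000 bytes, no overhead)
FAST_RATE = 3_000    # inserts per second at the start of the load
SLOW_RATE = 500      # inserts per second at the end of the load


def raw_data_gb(rows, row_bytes):
    """Raw row data in decimal gigabytes, excluding indexes and overhead."""
    return rows * row_bytes / 1e9


def load_hours(rows, rate_per_second):
    """Hours needed to insert `rows` at a constant rate."""
    return rows / rate_per_second / 3600


print(raw_data_gb(ROWS, ROW_BYTES))           # 320.0
print(round(load_hours(ROWS, FAST_RATE), 1))  # 14.8
print(round(load_hours(ROWS, SLOW_RATE), 1))  # 88.9
```

Under those assumptions the raw rows alone take roughly 320 GB of the 500GB volume, and the full load takes somewhere between about 15 and 89 hours depending on the sustained rate.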

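If the inserts do need to be slowed down to protect query latency, one option besides reducing the number of Lambda workers is to throttle each worker. Below is a minimal sketch of a token-bucket limiter, assuming each worker can simply sleep between batches. `TokenBucket` and the `insert_batch` call in the usage comment are hypothetical names, not part of any AWS or PostgreSQL library, and the limit applies per worker, so the total rate is the per-worker rate times the number of workers.

```python
import time


class TokenBucket:
    """Hypothetical limiter: at most `rate` rows per second on average."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens (rows) added per second
        self.capacity = capacity  # largest burst allowed, in rows
        self.tokens = capacity    # start with a full bucket
        self.clock = clock        # injectable so the logic can be tested
        self.last = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last check, up to capacity.
        now = self.clock()
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now

    def wait_time(self, n):
        """Seconds to wait before `n` rows may be sent (0.0 if allowed now)."""
        if n > self.capacity:
            raise ValueError("batch is larger than the bucket capacity")
        self._refill()
        if self.tokens >= n:
            return 0.0
        return (n - self.tokens) / self.rate

    def consume(self, n):
        """Block until `n` rows are allowed, then spend the tokens."""
        delay = self.wait_time(n)
        if delay > 0:
            time.sleep(delay)
            self._refill()
        self.tokens -= n


# Usage inside a worker (insert_batch is a placeholder for the real insert):
#   bucket = TokenBucket(rate=500, capacity=1000)
#   for batch in batches:
#       bucket.consume(len(batch))
#       insert_batch(batch)
```

The rate and capacity values in the usage comment are placeholders; suitable numbers would have to come from measuring query latency while the load runs.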