> From: Tory M Blue <tmb...@gmail.com>
>To: Jan Wieck <j...@wi3ck.info> 
>Cc: slony <slony1-general@lists.slony.info>
>Sent: Wednesday, 3 August 2016, 21:45
>Subject: Re: [Slony1-general] Slony Tuning, heavy slon backup every AM
> 
>On Wed, Aug 3, 2016 at 12:30 PM, Jan Wieck <j...@wi3ck.info> wrote:
>
>>On Wed, Aug 3, 2016 at 11:14 AM, Tory M Blue <tmb...@gmail.com> wrote:
>>
>>>Hey folks
>>>
>>>Running into some more issues, and I'm not finding a definitive tuning guide,
>>>nor can I tell why I'm backing up and why it takes so long to catch up ("so
>>>long" is relative).
>>>
>>>Running Slony 2.2.3 (but this has been an issue for a long time).
>>>
>>>Image to show you where I am this AM, but this is a constant every AM. I have
>>>large updates, about 2.5 million between 12am and 4am; the job is done but it
>>>will take Slon a couple of hours to catch up.
>>
>>Are there long-running transactions that start around the time the sl_log
>>starts growing? Any long-running transaction prevents a log switch from
>>finishing.
>>
>>Jan
>
>Hi Jan,
>
>There are really not many long queries during the big import, but things start
>getting longer when the sl_log tables have 10m+ rows in them.
>
>I actually log any transaction that takes over a second, and there is nothing
>that stands out, nothing that runs all that long. During this period I move my
>reporting to point at the master because of the delay on the standby, so there
>are some longer report queries, but the longest is 16 minutes.
>
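
One quick thing worth doing while it's catching up is to look at what's
actually running and how old the open transactions are, on both the origin
and the subscriber. A rough sketch against pg_stat_activity (column names as
of 9.2 or later):

    SELECT pid,
           now() - xact_start  AS xact_age,
           now() - query_start AS query_age,
           state,
           left(query, 60)     AS query
    FROM   pg_stat_activity
    WHERE  state <> 'idle'
    ORDER  BY xact_start;

A long-running COPY from the slon will show up here on the subscriber, and on
the origin any session with a very old xact_start (including ones sitting
"idle in transaction", which per-statement duration logging won't catch) is
the kind of thing that blocks the log switch Jan mentioned.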


Do you see any long-running COPY statements from the slon on the subscriber?
If so, one avenue to investigate is whether any of the replicated tables have
sub-optimal indexes on them.
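
To get a quick picture of what the planner has to choose from on a given
table, something like this does the job (schema and table name made up here),
as does \d tablename in psql:

    SELECT indexname, indexdef
    FROM   pg_indexes
    WHERE  schemaname = 'public'
      AND  tablename  = 'some_big_table';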

I mention it purely because I've been bitten by this myself and seen very 
similar symptoms to what you're reporting here.  There's every chance this is 
way off the mark, but I think it's worth mentioning.

The issue I saw was down to a bunch of pretty ugly indexes created by an
application; the result was inaccurate selectivity estimates causing the
planner to choose one of the ugly indexes instead of the primary key when
applying the changes on the subscriber. The multivariate planner statistics
that PostgreSQL doesn't yet have might have mitigated this, but that's beside
the point.
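
A quick way to check whether you're in the same situation is to EXPLAIN one of
the single-row updates roughly the way the slon issues them on the subscriber
(table, column and key value here are made up, just to show the shape):

    BEGIN;
    EXPLAIN ANALYZE
    UPDATE ONLY public.some_big_table
       SET some_column = 'new value'
     WHERE id = 12345;
    ROLLBACK;  -- EXPLAIN ANALYZE really runs the update, so throw it away

If the plan picks an index scan on one of the application's indexes rather
than the primary key (or whatever key the set uses for replication), it's the
same trap I fell into.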


For example, I'd see something like 200,000 records getting updated by a single
UPDATE statement on the origin in about 20 seconds, but when applied as 200,000
separate UPDATE statements on the subscriber it took more like 200 minutes. An
update using an index scan on the ugly index would take 62ms vs 0.1ms on the
primary key, but outwardly all I saw, until I enabled auto_explain, was a
long-running COPY.
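
For what it's worth, this is roughly how auto_explain can be pointed at the
statements the slon runs (the 50ms threshold is only an example, and
log_analyze adds overhead, so treat it as a diagnostic setting rather than
something to leave on permanently):

    -- In postgresql.conf, so the slon's own connections pick it up
    -- (shared_preload_libraries needs a restart, the other two a reload):
    --   shared_preload_libraries = 'auto_explain'
    --   auto_explain.log_min_duration = '50ms'
    --   auto_explain.log_analyze = on
    --
    -- Or, to spot-check interactively from a single psql session:
    LOAD 'auto_explain';
    SET auto_explain.log_min_duration = '50ms';
    SET auto_explain.log_analyze = on;

With that in place the individual updates show up in the PostgreSQL log with
their plans, which is what made the bad index scans visible in my case.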

Glyn
_______________________________________________
Slony1-general mailing list
Slony1-general@lists.slony.info
http://lists.slony.info/mailman/listinfo/slony1-general
