Would this be the right place? https://cwiki.apache.org/confluence/display/CONNECTORS/FAQ
On 16.12.2017 at 15:08, Karl Wright wrote:
> We should write up your findings in a FAQ if you find MySQL to perform
> better than postgresql.
>
> My guess is that what you are seeing is less "bad plan", more
> "contention and backoff", but that would take some effort to tease out.
>
> Karl
>
> On Fri, Dec 15, 2017 at 3:06 PM, Karl Wright <daddy...@gmail.com> wrote:
>
> Interesting. Thanks for the update.
> Karl
>
> On Fri, Dec 15, 2017 at 2:31 PM, Markus Schuch
> <markus.sch...@deutschebahn.com> wrote:
>
> We were not able to resolve the performance issue with the carrydown table.
>
> We switched to RDS Aurora MySQL, which performs similarly to the
> MariaDB database in our on-premise datacenter. No problems with the
> carrydown table queries of the SharePoint connector.
>
> Regards
> Markus
>
> From: Karl Wright [mailto:daddy...@gmail.com]
> Sent: Thursday, 30 November 2017 12:23
> To: user@manifoldcf.apache.org
> Subject: Re: Amazon RDS for PostgreSQL Support
>
> Typically the Jobs table is short, and sequential scans are faster
> than index joins. Postgres optimizes for that.
>
> No, the plans look fine. Another reason for the long-running queries
> might well be contention and locking -- many threads will be trying to
> do similar things at the same time.
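Karl's "contention and locking" hypothesis can be checked directly against the running database. A minimal sketch, assuming PostgreSQL 9.6 or later (where `pg_blocking_pids()` is available); this is an illustrative query, not something from the thread:

```sql
-- Sketch: list sessions that are blocked on locks, together with the
-- session blocking them (pg_blocking_pids() exists since PostgreSQL 9.6).
SELECT blocked.pid    AS blocked_pid,
       blocked.query  AS blocked_query,
       blocking.pid   AS blocking_pid,
       blocking.query AS blocking_query
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocking
  ON blocking.pid = ANY (pg_blocking_pids(blocked.pid));
```

If the long-running ManifoldCF queries show up in `blocked_query` while another worker thread holds the row locks, that would confirm lock waits rather than a bad plan.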
> You will note that multiple records get updated in one query; this is
> usually helpful, but when each update is expensive you could wind up
> with locking causing delays.
>
> Karl
>
> On Thu, Nov 30, 2017 at 2:33 AM, Markus Schuch
> <markus.sch...@deutschebahn.com> wrote:
>
> Hi Karl,
>
> we disabled autovacuum.
> The stats table shows there has been no autovacuum since then.
>
> The long-running queries still occur.
> There are no other apps using the database and no other jobs running.
>
> But there is another long-running query on the jobs table between the
> carrydown queries. This query seems to happen at the same time as the
> long-running carrydown query.
>
> The plan output says "Seq Scan on jobs…".
>
> 2017-11-30 08:16:16,008 WARN [Finisher thread] org.apache.manifoldcf.db: Found a long-running query (169057 ms): [SELECT id FROM jobs WHERE (status=? OR status=? OR status=?)
> FOR UPDATE]
> 2017-11-30 08:16:16,009 WARN [Finisher thread] org.apache.manifoldcf.db: Parameter 0: 'A'
> 2017-11-30 08:16:16,009 WARN [Finisher thread] org.apache.manifoldcf.db: Parameter 1: 'W'
> 2017-11-30 08:16:16,009 WARN [Finisher thread] org.apache.manifoldcf.db: Parameter 2: 'R'
> 2017-11-30 08:16:16,011 WARN [Finisher thread] org.apache.manifoldcf.db: Plan: LockRows (cost=0.00..4.43 rows=3 width=14) (actual time=0.022..0.024 rows=1 loops=1)
> 2017-11-30 08:16:16,011 WARN [Finisher thread] org.apache.manifoldcf.db: Plan:   ->  Seq Scan on jobs (cost=0.00..4.40 rows=3 width=14) (actual time=0.021..0.022 rows=1 loops=1)
> 2017-11-30 08:16:16,011 WARN [Finisher thread] org.apache.manifoldcf.db: Plan:         Filter: ((status = 'A'::bpchar) OR (status = 'W'::bpchar) OR (status = 'R'::bpchar))
> 2017-11-30 08:16:16,011 WARN [Finisher thread] org.apache.manifoldcf.db: Plan:         Rows Removed by Filter: 22
> 2017-11-30 08:16:16,011 WARN [Finisher thread] org.apache.manifoldcf.db: Plan: Planning time: 0.093 ms
> 2017-11-30 08:16:16,011 WARN [Finisher thread] org.apache.manifoldcf.db: Plan: Execution time: 0.041 ms
>
> Jobs table indices:
> public  jobs  jobs_pkey       CREATE UNIQUE INDEX jobs_pkey ON jobs USING btree (id)
> public  jobs  i1511464305264  CREATE INDEX i1511464305264 ON jobs USING btree (status, id, priority)
> public  jobs  i1511464305263  CREATE INDEX i1511464305263 ON jobs USING btree (status, processid)
> public  jobs  i1511464305262  CREATE INDEX i1511464305262 ON jobs USING btree (connectionname)
> public  jobs  i1511464305261  CREATE INDEX i1511464305261 ON jobs USING btree (failtime)
>
> Shouldn't i1511464305264 be used?
>
> Many thanks in advance
> Markus
>
> From: Karl Wright [mailto:daddy...@gmail.com]
> Sent: Wednesday, 29 November 2017 23:28
> To: user@manifoldcf.apache.org
> Subject: Re: Amazon RDS for PostgreSQL Support
>
> The plans look good for the carrydown execution, so I have to conclude
> that the long-running queries are due to other considerations --
> perhaps concurrent vacuuming, perhaps other queries pounding the
> database.
>
> SharePoint is especially hard on the carrydown table -- it gets huge.
>
> Karl
>
> On Wed, Nov 29, 2017 at 4:46 PM, Markus Schuch
> <markus.sch...@deutschebahn.com> wrote:
>
> Hi,
>
> since nobody responded, we started to experiment.
>
> Setup:
> AWS RDS Postgres 9.6.3
> db.m4.xlarge (4 cores, 16 GB RAM)
> provisioned IOPS 4000
>
> Indexing performance for most jobs is pretty good, except for a bigger
> SharePoint crawl (~200,000 docs).
>
> We are seeing a lot of long-running queries for the carrydown table here.
>
> Logfile:
> https://gist.githubusercontent.com/schuch/b553dc3f825e0c640a52ce3f530ec3e0/raw/34e493ef520d2272b25f7d9d8acc3f8417550c17/gistfile1.txt
>
> Settings:
> https://gist.githubusercontent.com/schuch/44fcbb312df37ab89a910024a56ff835/raw/eba6ff9bc82026f181d5687584e9dd99cdffb242/postgresql%2520RDS%2520settings
> (due to this discussion we left autovacuum on:
> https://www.mail-archive.com/user@manifoldcf.apache.org/msg03512.html)
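Whether autovacuum is actually keeping up with the carrydown table can be read from Postgres's statistics views. A sketch, assuming the standard ManifoldCF table name `carrydown`; the queries are illustrative, not from the thread:

```sql
-- Sketch: check autovacuum activity and dead-tuple buildup on carrydown.
SELECT relname, last_autovacuum, last_autoanalyze, n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'carrydown';

-- How large has the table grown, including its indexes?
SELECT pg_size_pretty(pg_total_relation_size('carrydown'));
```

A high `n_dead_tup` relative to `n_live_tup` with a stale `last_autovacuum` would suggest the vacuum settings, rather than the plans, are the bottleneck.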
> CPU usage of the database is between 15% and 35%.
>
> Are we hitting bad automatic optimizations of the newer PostgreSQL version?
>
> Many thanks in advance,
> Markus
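Markus's earlier question -- shouldn't index i1511464305264 be used for the jobs query? -- can be tested empirically. On a jobs table of roughly 23 rows, a sequential scan is cheaper than an index scan, which is what Karl described. A hedged sketch for comparing both plans (status values taken from the log in this thread):

```sql
-- Sketch: see which plan the planner picks for the jobs query.
EXPLAIN ANALYZE
SELECT id FROM jobs
WHERE status = 'A' OR status = 'W' OR status = 'R'
FOR UPDATE;

-- Disable sequential scans for this session only, to force the
-- index-based alternative and compare its estimated cost:
SET enable_seqscan = off;
EXPLAIN ANALYZE
SELECT id FROM jobs
WHERE status = 'A' OR status = 'W' OR status = 'R'
FOR UPDATE;
RESET enable_seqscan;
```

If the forced index plan shows a higher cost than the sequential scan, the planner's choice is correct, and (as the 0.041 ms execution time in the log suggests) the 169-second runtime came from waiting on row locks, not from the scan itself.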