Hello,
1/ The database we use is PostgreSQL version 9.6.
2/ I will check the logs to see what is happening with the queries.
3/ We run VACUUM FULL ANALYZE every 24 hours. For each table, we set the
reindex threshold to 5000000 (in properties.xml) with a line such as:
<property
name="org.apache.manifoldcf.db.postgres.reindex.intrinsiclink"
value="5000000" />
Is there a setting that disables the reindexing triggered by
ManifoldCF?
Thanks,
Daniel
On 8 Feb 2019 at 16:00, Karl Wright (via Internet, sender
user-return-5674-daniel.lirot=developpement-durable.gouv...@manifoldcf.apache.org)
wrote:
Hello,
(1) What database are you using for this? Some databases require
maintenance periodically or have other heavy usage constraints.
(2) Every time a query takes more than a minute to execute, it is
logged, along with the query plan. You need to look at the ManifoldCF
log to see which queries are problematic before concluding anything.
(3) For every database table, you can individually configure
approximately how many table operations occur before MCF re-analyzes
the table. However, it's likely that you have the opposite problem: a
bad query plan for the query that queues documents for processing.
Preventing that may require more frequent analysis, not less. But we
cannot tell until we understand which queries are taking a long time.
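For illustration, the per-table analyze threshold mentioned in (3) could
be set in properties.xml along the lines below. This is a minimal sketch:
it assumes the analyze property follows the same naming pattern as the
reindex property shown elsewhere in this thread
(org.apache.manifoldcf.db.postgres.analyze.<tablename>). The table name
jobqueue and the value 10000 are placeholder assumptions, not
recommendations. Check the property name against the documentation for
your ManifoldCF version before relying on it.

```xml
<!-- Hypothetical example: the table name and value are assumptions. -->
<!-- Re-analyze the jobqueue table after roughly this many changes. -->
<property
name="org.apache.manifoldcf.db.postgres.analyze.jobqueue"
value="10000" />
```

A lower value means the table statistics are refreshed more often, which
is the direction to try if the logged query plans turn out to be bad.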
Thanks,
Karl
On Fri, Feb 8, 2019 at 8:07 AM LIROT Daniel - SG/SPSSI/CPII/DOSO/ET
<daniel.li...@developpement-durable.gouv.fr> wrote:
Hello,
We use ManifoldCF v2.10 with PostgreSQL (9.6) to crawl our websites.
This represents approximately 1.2 million documents.
We split the crawl into 4 jobs that distribute their results across 3
Solr collections.
The crawl performs well up to about 500000 documents (25000 to 30000
docs/hour), then performance drops sharply as it progresses. We
observe very long freezes, to the point that the crawl appears to
have stopped.
We suspect reindexing, notably of the intrinsiclink table, which is
very large (85 million rows).
Is it possible to disable the reindexing controlled by ManifoldCF?
Any other ideas?
Best regards,
LIROT Daniel