Re: ManifoldCF + Postgresql - long freeze on job

2019-02-11 Thread Karl Wright
There is no such specific value, but you can effectively disable it by setting a very large value, e.g. 20. Karl On Mon, Feb 11, 2019 at 7:43 AM LIROT Daniel - SG/SPSSI/CPII/DOSO/ET < daniel.li...@developpement-durable.gouv.fr> wrote: > Hi, > > We see the table "Advanced

Re: ManifoldCF + Postgresql - long freeze on job

2019-02-11 Thread LIROT Daniel - SG/SPSSI/CPII/DOSO/ET
Hi, We see the table "Advanced properties.xml properties"; we use it to parameterize the intrinsiclink table with `<property name="org.apache.manifoldcf.db.postgres.reindex.intrinsiclink" value="500" />`, and we do the same for the other tables, but is there a value that allows us to disable the
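A minimal sketch of how such a reindex property sits in ManifoldCF's properties.xml. Only the intrinsiclink property name and the value 500 come from this thread; the surrounding `<configuration>` wrapper follows the standard properties.xml layout, and any other table names would be assumptions:

```xml
<!-- Sketch of a properties.xml fragment; the intrinsiclink line is taken
     from the thread, the wrapper is the usual ManifoldCF file structure. -->
<configuration>
  <property name="org.apache.manifoldcf.db.postgres.reindex.intrinsiclink" value="500"/>
</configuration>
```

Per Karl's reply above, setting the value very high effectively disables the periodic reindex for that table, since the threshold is never reached.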

Re: ManifoldCF + Postgresql - long freeze on job

2019-02-11 Thread Karl Wright
See: https://manifoldcf.apache.org/release/release-1.10/en_US/how-to-build-and-deploy.html#file+properties Look at the table "Advanced properties.xml properties" Karl On Mon, Feb 11, 2019 at 4:16 AM LIROT Daniel - SG/SPSSI/CPII/DOSO/ET < daniel.li...@developpement-durable.gouv.fr> wrote: >

Re: ManifoldCF + Postgresql - long freeze on job

2019-02-11 Thread LIROT Daniel - SG/SPSSI/CPII/DOSO/ET
Hello, 1/ The database we use is PostgreSQL version 9.6. 2/ I will look at what is happening with the queries in the logs. 3/ We do a VACUUM FULL ANALYZE every 24 hours; for each table we set the reindex threshold to the value 500 (in properties.xml) with the line:
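The 24-hour maintenance step described above can be sketched as plain PostgreSQL SQL. The intrinsiclink table name comes from this thread; any other ManifoldCF table names would be assumptions, so only that one is shown:

```sql
-- Nightly maintenance as described in the thread: full vacuum plus a
-- statistics refresh for the planner, run once every 24 hours.
VACUUM FULL ANALYZE intrinsiclink;
-- Note: VACUUM FULL takes an ACCESS EXCLUSIVE lock on the table, so it
-- should be scheduled while no crawl jobs are active.
```

A plain `VACUUM ANALYZE` (without FULL) avoids the exclusive lock and is often sufficient for routine maintenance.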

Re: ManifoldCF + Postgresql - long freeze on job

2019-02-08 Thread Karl Wright
Hello, (1) What database are you using for this? Some databases require periodic maintenance or have other heavy-usage constraints. (2) Every time a query takes more than a minute to execute, it is logged, along with its query plan. You need to look at the manifoldcf log to see which

ManifoldCF + Postgresql - long freeze on job

2019-02-08 Thread LIROT Daniel - SG/SPSSI/CPII/DOSO/ET
Hello, We use ManifoldCF v2.10 with PostgreSQL (9.6) to crawl our websites. This represents approximately 1.2 million documents. We split the crawl into 4 jobs that distribute their results across 3 SOLR collections. The crawl performs well up to 50 documents (25000 to 3 docs/hour), then