Yes, I am using the Tika extractor. The ManifoldCF version is 2.13, and I am using PostgreSQL as the database.
I have four jobs. One crawls (and re-crawls) data from a public site; the other three crawl intranet sites. Two of the intranet jobs complete without any error. The third, which has more data than the other two, gives me this error. Could this be a site accessibility issue? Can you please suggest a solution?

Thanks and regards,
Priya

On Wed, Aug 14, 2019 at 3:11 PM Karl Wright <daddy...@gmail.com> wrote:

> I will need to know more. Do you have the tika extractor in your
> pipeline? If so, what version of ManifoldCF are you using? Tika has had
> bugs related to memory consumption in the past; the out of memory exception
> may be coming from it and therefore a stack trace is critical to have.
>
> Alternatively, you can upgrade to the latest version of MCF (2.13) and that
> has a newer version of Tika without those problems. But you may need to get
> the agents process more memory.
>
> Another possible cause is that you're using hsqldb in production. HSQLDB
> keeps all of its tables in memory. If you have a large crawl, you do not
> want to use HSQLDB.
>
> Thanks,
> Karl
>
> On Wed, Aug 14, 2019 at 3:41 AM Priya Arora <pr...@smartshore.nl> wrote:
>
> > Hi Karl,
> >
> > The ManifoldCF logs show an error like:
> > agents process ran out of memory - shutting down
> > java.lang.OutOfMemoryError: Java heap space
> >
> > I have -Xms1024m and -Xmx1024m allocated in the
> > start-options.env.unix and start-options.env.win files.
> >
> > Configuration:
> > 1) Crawler server - 16 GB RAM and 8-Core Intel(R) Xeon(R) CPU E5-2660
> > v3 @ 2.60GHz
> > 2) Elasticsearch server - 48GB and 1-Core Intel(R) Xeon(R) CPU E5-2660
> > v3 @ 2.60GHz
> >
> > I am using postgres as the database.
> >
> > Can you please help me out with what to do in this case?
> >
> > Thanks
> > Priya
> >
> > On Wed, Aug 14, 2019 at 12:33 PM Karl Wright <daddy...@gmail.com> wrote:
> >
> > > The error occurs, I believe, as the result of basic connection problems,
> > > e.g. the connection is getting rejected. You can find more information in
> > > the simple history, and in the manifoldcf log.
> > >
> > > I would like to know the underlying cause, since the connector should be
> > > resilient against errors of this kind.
> > >
> > > Karl
> > >
> > > On Wed, Aug 14, 2019, 1:46 AM Priya Arora <pr...@smartshore.nl> wrote:
> > >
> > > > Hi Karl,
> > > >
> > > > I have a web repository connector (seeds: an intranet site), and the
> > > > job is on the Production server.
> > > >
> > > > When I ran the job on PROD, it stopped by itself 2 times with an error:
> > > > Error: Unexpected HTTP result code: -1: null.
> > > >
> > > > Can you please give me an idea of the cases in which this happens?
> > > >
> > > > Thanks and regards
> > > > Priya Arora