Without an out-of-memory stack trace, I cannot definitively point to Tika
or say that it's a specific kind of file.  Please send one.

Karl
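
The point quoted below, that the number of Tika Extractor connections
effectively caps how many extractions run at once, can be pictured with a
small toy model. This is only a sketch under stated assumptions, not
ManifoldCF's actual implementation: `run_crawl`, `documents_per_worker`,
and the sleep that stands in for an extraction are all hypothetical, and
only the figures 15 and 10 come from this thread.

```python
import threading
import time

WORKER_THREADS = 15         # worker threads reported in this thread
EXTRACTOR_CONNECTIONS = 10  # Tika Extractor connections reported in this thread


def run_crawl(worker_threads, max_connections, documents_per_worker=3):
    """Toy model: return the peak number of simultaneous 'extractions'.

    Hypothetical helper for illustration only; it does not call
    ManifoldCF or Tika.
    """
    permits = threading.Semaphore(max_connections)  # one permit per connection
    lock = threading.Lock()
    active = 0
    peak = 0

    def worker():
        nonlocal active, peak
        for _ in range(documents_per_worker):
            with permits:  # wait until a connection is free
                with lock:
                    active += 1
                    peak = max(peak, active)
                time.sleep(0.01)  # stand-in for one document extraction
                with lock:
                    active -= 1

    threads = [threading.Thread(target=worker) for _ in range(worker_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return peak


if __name__ == "__main__":
    peak = run_crawl(WORKER_THREADS, EXTRACTOR_CONNECTIONS)
    print("peak simultaneous extractions:", peak)
```

In this model the peak can never exceed the smaller of the two settings,
so lowering either one lowers the worst-case number of documents held in
memory at the same time.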


On Fri, Aug 16, 2019 at 2:09 AM Priya Arora <pr...@smartshore.nl> wrote:

> *The existing thread/connection configuration is:*
>
> How many worker threads do you have? - 15 worker threads have been
> allocated (in the properties.xml file).
> For the Tika Extractor, 10 connections are defined.
>
> Would you suggest reducing these numbers further?
> If not, what else could be a solution?
>
> Thanks
> Priya
>
>
>
> On Wed, Aug 14, 2019 at 5:32 PM Karl Wright <daddy...@gmail.com> wrote:
>
> > How many worker threads do you have?
> > Even if each worker thread is constrained in memory, and they should be,
> > you can easily cause things to run out of memory by giving too many
> > worker threads.  Another way to keep Tika's usage constrained would be to
> > reduce the number of Tika Extractor connections, because that effectively
> > limits the number of extractions that can be going on at the same time.
> >
> > Karl
> >
> >
> > On Wed, Aug 14, 2019 at 7:23 AM Priya Arora <pr...@smartshore.nl> wrote:
> >
> > > Yes, I am using the Tika Extractor, and the ManifoldCF version in use
> > > is 2.13.
> > > I am also using PostgreSQL as the database.
> > >
> > > I have 4 types of jobs.
> > > One is accessing/re-crawling data from a public site. The other three
> > > are accessing an intranet site.
> > > Of those three, two give me correct output without any error, and the
> > > third one, which has more data than the other two, gives me this
> > > error.
> > >
> > > Could this be caused by a site accessibility issue? Can you please
> > > suggest a solution?
> > > Thanks and regards
> > > Priya
> > >
> > > On Wed, Aug 14, 2019 at 3:11 PM Karl Wright <daddy...@gmail.com> wrote:
> > >
> > > > I will need to know more.  Do you have the Tika extractor in your
> > > > pipeline?  If so, what version of ManifoldCF are you using?  Tika has
> > > > had bugs related to memory consumption in the past; the out of memory
> > > > exception may be coming from it and therefore a stack trace is
> > > > critical to have.
> > > >
> > > > Alternatively, you can upgrade to the latest version of MCF (2.13),
> > > > which has a newer version of Tika without those problems.  But you
> > > > may need to give the agents process more memory.
> > > >
> > > > Another possible cause is that you're using HSQLDB in production.
> > > > HSQLDB keeps all of its tables in memory.  If you have a large crawl,
> > > > you do not want to use HSQLDB.
> > > >
> > > > Thanks,
> > > > Karl
> > > >
> > > >
> > > > On Wed, Aug 14, 2019 at 3:41 AM Priya Arora <pr...@smartshore.nl> wrote:
> > > >
> > > > > Hi Karl,
> > > > >
> > > > > > The ManifoldCF logs show an error like:
> > > > > agents process ran out of memory - shutting down
> > > > > java.lang.OutOfMemoryError: Java heap space
> > > > >
> > > > > > Also, I have -Xms1024m and -Xmx1024m allocated in the
> > > > > > start-options.env.unix and start-options.env.win files.
> > > > > > The configuration is:
> > > > > > 1) Crawler server: 16 GB RAM and 8-core Intel(R) Xeon(R) CPU
> > > > > > E5-2660 v3 @ 2.60GHz
> > > > > >
> > > > > > 2) Elasticsearch server: 48 GB and 1-core Intel(R) Xeon(R) CPU
> > > > > > E5-2660 v3 @ 2.60GHz
> > > > > >
> > > > > > I am using PostgreSQL as the database.
> > > > >
> > > > > > Can you please help me with what to do in this case?
> > > > >
> > > > > Thanks
> > > > > Priya
> > > > >
> > > > >
> > > > > > On Wed, Aug 14, 2019 at 12:33 PM Karl Wright <daddy...@gmail.com> wrote:
> > > > >
> > > > > > > The error occurs, I believe, as the result of basic connection
> > > > > > > problems, e.g. the connection is getting rejected.  You can find
> > > > > > > more information in the simple history, and in the manifoldcf
> > > > > > > log.
> > > > > > >
> > > > > > > I would like to know the underlying cause, since the connector
> > > > > > > should be resilient against errors of this kind.
> > > > > >
> > > > > > Karl
> > > > > >
> > > > > >
> > > > > > > On Wed, Aug 14, 2019, 1:46 AM Priya Arora <pr...@smartshore.nl> wrote:
> > > > > >
> > > > > > > Hi Karl,
> > > > > > >
> > > > > > > > I have a Web repository connector (seeds: an intranet site),
> > > > > > > > and the job is on the Production server.
> > > > > > > >
> > > > > > > > When I ran the job on PROD, it stopped itself 2 times with an
> > > > > > > > error: Error: Unexpected HTTP result code: -1: null.
> > > > > > >
> > > > > > >
> > > > > > > > Can you please give me an idea of the cases in which this happens?
> > > > > > >
> > > > > > > Thanks and regards
> > > > > > > Priya Arora
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
