Hi Priya,

In my experience, I would focus on the OutOfMemoryError (OOME) first.
8 GB of heap can be enough, but it does not have to be.

First, I would check whether the JVM is really getting the desired heap
size. The Docker environment makes that a little harder to find out,
since you need access to the JVM metrics, e.g. via jmxremote.
Being able to monitor the JVM metrics helps you correlate the
errors with heap usage and garbage collection activity.
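
As a rough sketch of what such a check could look like: the small class
below prints the heap limit and the cumulative garbage collection
counters of the JVM it runs in, using only the standard
java.lang.management API. The class name HeapCheck and its helper methods
are made up for this illustration and are not part of ManifoldCF. Note
that it reports on its own JVM, not on the running ManifoldCF process, so
it only tells you what a JVM started with the same options inside the
same container ends up with.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration; not part of ManifoldCF.
public class HeapCheck {

    /** Converts a byte count to whole mebibytes. */
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    /** Builds a report of heap limits and cumulative GC activity for the current JVM. */
    public static String report() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        StringBuilder sb = new StringBuilder();

        // getMax() returns -1 when the maximum is undefined.
        long max = heap.getMax();
        sb.append("heap max:  ").append(max < 0 ? "undefined" : toMiB(max) + " MiB").append('\n');
        sb.append("heap used: ").append(toMiB(heap.getUsed())).append(" MiB\n");

        // One line per collector: number of collections and total time spent in them.
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append("gc ").append(gc.getName())
              .append(": count=").append(gc.getCollectionCount())
              .append(", timeMs=").append(gc.getCollectionTime()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(report());
    }
}
```

Compiled and started inside the container with the same options as
ManifoldCF (for example "java -Xmx8192m HeapCheck"), it shows whether the
8192 MB setting is actually accepted there. For the live ManifoldCF
process, the same MXBeans are what a JMX client shows once jmxremote is
enabled.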

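The errors you see in the PostgreSQL JDBC driver are very likely related
to the OOME: in your second log the driver's shared timer thread
("PostgreSQL-JDBC-SharedTimer-1") dies with the OOME, and a timer whose
thread has died rejects new tasks with "Timer already cancelled".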

Some questions I would ask myself:

Do the problems occur repeatedly only when crawling this specific
content source, or only with this specific output connection? Can you
reproduce them outside of Docker in a controlled dev environment? Or is
it a more general problem with your ManifoldCF instance?

Maybe there are some huge files being crawled in your content source?
Do you have any kind of transformations configured (e.g. a content size
limit)? You should check the job's history for any patterns, for
example the error always occurring after the same document has been
encountered.

Cheers
Markus



On 20.12.2019 at 09:59, Priya Arora wrote:
> Hi Markus,
>
> The heap size is set to 8 GB. In the ManifoldCF start-options-unix file,
> the Xmx etc. parameters are set to 8192 MB.
>
> It seems to be a memory issue, and also an issue when ManifoldCF tries
> to communicate with the database. Do you explicitly define a connection
> timer somewhere for communicating with Postgres?
> Postgres is installed via a Docker image pull, followed by some
> changes in properties.xml (of ManifoldCF) to connect to the database.
> On the other hand, Elasticsearch also has sufficient memory, and
> ManifoldCF is provided with 8 CPU cores.
>
> Can you suggest a solution?
>
> Thanks
> Priya
>
> On Fri, Dec 20, 2019 at 2:23 PM Markus Schuch <[email protected]
> <mailto:[email protected]>> wrote:
>
>     Hi Priya,
>
>     your manifoldcf JVM suffers from high garbage collection pressure:
>
>         java.lang.OutOfMemoryError: GC overhead limit exceeded
>
>     What is your current heap size?
>     Without knowing that, I suggest increasing the heap size (java
>     -Xmx...).
>
>     Cheers,
>     Markus
>
>     On 20.12.2019 at 09:02, Priya Arora wrote:
>     > Hi All,
>     >
>     > I am facing below error while accessing Manifoldcf. Requirement is to
>     > crawl data from a website using Repository as "Web" and Output
>     > connector as "Elastic Search"
>     > Manifoldcf is configured inside a docker container and also
>     > postgres is used a docker container.
>     > When launching manifold getting below error
>     > image.png
>     >
>     > When checked logs:-
>     > *1)sudo docker exec -it 0b872dfafc5c tail -1000 /usr/share/manifoldcf/example/logs/manifoldcf.log*
>     > FATAL 2019-12-20T06:06:13,176 (Stuffer thread) - Error tossed: Timer already cancelled.
>     > java.lang.IllegalStateException: Timer already cancelled.
>     >         at java.util.Timer.sched(Timer.java:397) ~[?:1.8.0_232]
>     >         at java.util.Timer.schedule(Timer.java:193) ~[?:1.8.0_232]
>     >         at org.postgresql.jdbc.PgConnection.addTimerTask(PgConnection.java:1113) ~[postgresql-42.1.3.jar:42.1.3]
>     >         at org.postgresql.jdbc.PgStatement.startTimer(PgStatement.java:887) ~[postgresql-42.1.3.jar:42.1.3]
>     >         at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:427) ~[postgresql-42.1.3.jar:42.1.3]
>     >         at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354) ~[postgresql-42.1.3.jar:42.1.3]
>     >         at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169) ~[postgresql-42.1.3.jar:42.1.3]
>     >         at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:136) ~[postgresql-42.1.3.jar:42.1.3]
>     >         at org.postgresql.jdbc.PgConnection.isValid(PgConnection.java:1311) ~[postgresql-42.1.3.jar:42.1.3]
>     >         at org.apache.manifoldcf.core.jdbcpool.ConnectionPool.getConnection(ConnectionPool.java:92) ~[mcf-core.jar:?]
>     >         at org.apache.manifoldcf.core.database.ConnectionFactory.getConnectionWithRetries(ConnectionFactory.java:126) ~[mcf-core.jar:?]
>     >         at org.apache.manifoldcf.core.database.ConnectionFactory.getConnection(ConnectionFactory.java:75) ~[mcf-core.jar:?]
>     >         at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:797) ~[mcf-core.jar:?]
>     >         at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1457) ~[mcf-core.jar:?]
>     >         at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146) ~[mcf-core.jar:?]
>     >         at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204) ~[mcf-core.jar:?]
>     >         at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837) ~[mcf-core.jar:?]
>     >         at org.apache.manifoldcf.core.database.BaseTable.performQuery(BaseTable.java:221) ~[mcf-core.jar:?]
>     >         at org.apache.manifoldcf.crawler.jobs.Jobs.getActiveJobConnections(Jobs.java:736) ~[mcf-pull-agent.jar:?]
>     >         at org.apache.manifoldcf.crawler.jobs.JobManager.getNextDocuments(JobManager.java:2869) ~[mcf-pull-agent.jar:?]
>     >         at org.apache.manifoldcf.crawler.system.StufferThread.run(StufferThread.java:186) [mcf-pull-agent.jar:?]
>     > *2)sudo docker logs <CID> --tail 1000*
>     > Exception in thread "PostgreSQL-JDBC-SharedTimer-1" java.lang.OutOfMemoryError: GC overhead limit exceeded
>     >         at java.util.ArrayList.iterator(ArrayList.java:840)
>     >         at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
>     >         at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
>     >         at java.net.InetAddress.getAllByName(InetAddress.java:1193)
>     >         at java.net.InetAddress.getAllByName(InetAddress.java:1127)
>     >         at java.net.InetAddress.getByName(InetAddress.java:1077)
>     >         at java.net.InetSocketAddress.<init>(InetSocketAddress.java:220)
>     >         at org.postgresql.core.PGStream.<init>(PGStream.java:66)
>     >         at org.postgresql.core.QueryExecutorBase.sendQueryCancel(QueryExecutorBase.java:155)
>     >         at org.postgresql.jdbc.PgConnection.cancelQuery(PgConnection.java:971)
>     >         at org.postgresql.jdbc.PgStatement.cancel(PgStatement.java:812)
>     >         at org.postgresql.jdbc.PgStatement$1.run(PgStatement.java:880)
>     >         at java.util.TimerThread.mainLoop(Timer.java:555)
>     >         at java.util.TimerThread.run(Timer.java:505)
>     > 2019-12-19 18:09:05,848 Job start thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
>     > 2019-12-19 18:09:05,848 Seeding thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
>     > 2019-12-19 18:09:05,848 Job reset thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
>     > 2019-12-19 18:09:05,848 Job notification thread ERROR Unable to write to stream logs/manifoldcf.log for appender MyFile
>     > 2019-12-19 18:09:05,849 Seeding thread ERROR An exception occurred processing Appender MyFile
>     > org.apache.logging.log4j.core.appender.AppenderLoggingException: Error flushing stream logs/manifoldcf.log
>     >         at org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:159).
>     >
>     > _Also tried the approach to clean up Database by truncating all
>     > manifoldcf related tables, but still getting this error._
>     >
>     > Parameters defined in the *postgresql conf* file are as suggested, and
>     > "max_pred_locks_per_transaction" is set to value "256".
>     > image.png
>
