Hi All,
When I try to execute a bash command inside the ManifoldCF container, I
get an error.
[image: image.png]
When checking the logs with sudo docker logs <CID>, I see:
2019-12-19 18:09:05,848 Job start thread ERROR Unable to write to stream
logs/manifoldcf.log for appender MyFile
2019-12-19 18:09:05,848 Seeding thread ERROR Unable to write to stream
logs/manifoldcf.log for appender MyFile
2019-12-19 18:09:05,848 Job reset thread ERROR Unable to write to stream
logs/manifoldcf.log for appender MyFile
2019-12-19 18:09:05,848 Job notification thread ERROR Unable to write to
stream logs/manifoldcf.log for appender MyFile
2019-12-19 18:09:05,849 Seeding thread ERROR An exception occurred
processing Appender MyFile
org.apache.logging.log4j.core.appender.AppenderLoggingException: Error
flushing stream logs/manifoldcf.log
at
org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:159)
Can anybody suggest the reason behind this error?
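
In case it is relevant, here is a check I can run inside the container
for the two usual causes of this appender error (a logs directory that
is not writable, or a full filesystem). The helper name is made up for
illustration; the path is the one used elsewhere in this thread:

```shell
# Run inside the container, e.g.:
#   sudo docker exec -it <CID> sh check_logs_dir.sh
# Checks the two usual causes of "Unable to write to stream": a logs
# directory that is not writable, and a full filesystem.
check_logs_dir() {
  dir="$1"
  if [ ! -w "$dir" ]; then
    echo "not-writable: $dir"
    return 1
  fi
  # df -P prints one POSIX-format line per filesystem; field 5 is Use%.
  usage=$(df -P "$dir" | awk 'NR==2 { gsub("%", "", $5); print $5 }')
  if [ "$usage" -ge 100 ]; then
    echo "disk-full: $dir"
    return 1
  fi
  echo "ok: $dir"
}

# Default to /tmp so the sketch runs anywhere; in the container the
# interesting path is /usr/share/manifoldcf/example/logs.
check_logs_dir "${1:-/tmp}"
```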
Thanks
Priya
On Fri, Dec 20, 2019 at 3:37 PM Priya Arora <[email protected]> wrote:
> Hi Markus,
>
> Many thanks for your reply!
>
> I tried this approach to reproduce the scenario in a different
> environment. The case where I see the error above is when I am crawling
> INTRANET sites that are accessible over a remote server. I have also
> used these transformation connectors: Allow Documents, Tika Parser,
> Content Limiter (10000000), and Metadata Adjuster.
>
> When I tried to reproduce the error with public sites of the same domain
> on a different server (DEV), the crawl was successful, with no error.
> There was no Postgres-related error either.
>
> Can it depend on server-related configurations such as a firewall? This
> case includes some firewall and security-related configurations.
>
> Thanks
> Priya
>
>
>
>
> On Fri, Dec 20, 2019 at 3:23 PM Markus Schuch <[email protected]>
> wrote:
>
>> Hi Priya,
>>
>> In my experience, I would focus on the OutOfMemoryError (OOME).
>> 8 gigs can be enough, but they don't have to be.
>>
>> First, I would check whether the JVM is really getting the desired heap
>> size. The dockerized environment makes that a little harder to find
>> out, since you need access to the JVM metrics, e.g. via jmxremote.
>> Being able to monitor the JVM metrics helps you correlate the errors
>> with heap and garbage collection activity.
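>>
>> A quick way to verify the heap: running "jcmd <pid> VM.flags" inside
>> the container prints MaxHeapSize in bytes. A tiny helper (the function
>> name is made up) to convert an -Xmx value to bytes for that comparison:

```shell
# Convert an -Xmx value like "8192m" or "8g" to bytes, to compare with
# the MaxHeapSize reported by "jcmd <pid> VM.flags" in the container.
xmx_to_bytes() {
  v="$1"
  num="${v%[kKmMgG]}"
  case "$v" in
    *[kK]) echo $((num * 1024)) ;;
    *[mM]) echo $((num * 1024 * 1024)) ;;
    *[gG]) echo $((num * 1024 * 1024 * 1024)) ;;
    # A plain number is already in bytes.
    *)     echo "$num" ;;
  esac
}

xmx_to_bytes 8192m   # prints 8589934592, i.e. the 8 GB from this thread
```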
>>
>> The errors you see from the PostgreSQL JDBC driver are very likely
>> related to the OOME.
>>
>> Some questions I would ask myself:
>>
>> Do the problems repeatedly occur only when crawling this specific
>> content source, or only with this specific output connection? Can you
>> reproduce it outside of Docker in a controlled dev environment? Or is
>> it a more general problem with your ManifoldCF instance?
>>
>> Maybe there are some huge files being crawled in your content source?
>> Do you have any kind of transformations configured (e.g. a content
>> size limit)? You should check the job's history for patterns, such as
>> the error always arising after encountering the same document.
>>
>> Cheers
>> Markus
>>
>>
>>
>> On 20.12.2019 at 09:59, Priya Arora wrote:
>> > Hi Markus ,
>> >
>> > The heap size defined is 8 GB. In the ManifoldCF start-options-unix
>> > file, the -Xmx parameter is set to 8192 MB.
>> >
>> > It seems to be a memory issue, and it also appears when ManifoldCF
>> > tries to communicate with the database. Do you explicitly define
>> > somewhere a connection timeout for communicating with Postgres?
>> > Postgres is installed as part of a Docker image pull, and then some
>> > changes were made in properties.xml (of ManifoldCF) to connect to
>> > the database.
>> > On the other hand, Elasticsearch also has sufficient memory, and
>> > ManifoldCF is provided with 8 CPU cores.
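>> >
>> > The database entries in properties.xml are along these lines
>> > (property names as in the ManifoldCF deployment documentation; the
>> > host name and credentials below are placeholders, not my real
>> > values):

```xml
<property name="org.apache.manifoldcf.databaseimplementationclass"
          value="org.apache.manifoldcf.core.database.DBInterfacePostgreSQL"/>
<property name="org.apache.manifoldcf.postgresql.hostname" value="postgres"/>
<property name="org.apache.manifoldcf.postgresql.port" value="5432"/>
<property name="org.apache.manifoldcf.database.username" value="manifoldcf"/>
<property name="org.apache.manifoldcf.database.password" value="password"/>
```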
>> >
>> > Can you suggest a solution?
>> >
>> > Thanks
>> > Priya
>> >
>> > On Fri, Dec 20, 2019 at 2:23 PM Markus Schuch <[email protected]
>> > <mailto:[email protected]>> wrote:
>> >
>> > Hi Priya,
>> >
>> > Your ManifoldCF JVM suffers from high garbage collection pressure:
>> >
>> > java.lang.OutOfMemoryError: GC overhead limit exceeded
>> >
>> > What is your current heap size?
>> > Without knowing that, I suggest increasing the heap size (java
>> > -Xmx...).
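>> >
>> > For reference, a sketch of what the heap options in ManifoldCF's
>> > start-options-unix file might look like (the exact file format and
>> > option set vary by ManifoldCF version, so treat this as
>> > illustrative):

```
-Xms8192m
-Xmx8192m
```

>> > Setting -Xms equal to -Xmx avoids heap resizing; the key point is
>> > to confirm the running JVM actually picked the options up.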
>> >
>> > Cheers,
>> > Markus
>> >
>> > On 20.12.2019 at 09:02, Priya Arora wrote:
>> > > Hi All,
>> > >
>> > > I am facing the below error while accessing ManifoldCF. The
>> > > requirement is to crawl data from a website using the "Web"
>> > > repository connector and the "Elastic Search" output connector.
>> > > ManifoldCF is configured inside a Docker container, and Postgres
>> > > is also used as a Docker container.
>> > > When launching ManifoldCF I get the below error:
>> > > image.png
>> > >
>> > > When I checked the logs:
>> > > *1) sudo docker exec -it 0b872dfafc5c tail -1000
>> > > /usr/share/manifoldcf/example/logs/manifoldcf.log*
>> > > FATAL 2019-12-20T06:06:13,176 (Stuffer thread) - Error tossed:
>> > > Timer already cancelled.
>> > > java.lang.IllegalStateException: Timer already cancelled.
>> > > at java.util.Timer.sched(Timer.java:397) ~[?:1.8.0_232]
>> > > at java.util.Timer.schedule(Timer.java:193) ~[?:1.8.0_232]
>> > > at org.postgresql.jdbc.PgConnection.addTimerTask(PgConnection.java:1113) ~[postgresql-42.1.3.jar:42.1.3]
>> > > at org.postgresql.jdbc.PgStatement.startTimer(PgStatement.java:887) ~[postgresql-42.1.3.jar:42.1.3]
>> > > at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:427) ~[postgresql-42.1.3.jar:42.1.3]
>> > > at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:354) ~[postgresql-42.1.3.jar:42.1.3]
>> > > at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:169) ~[postgresql-42.1.3.jar:42.1.3]
>> > > at org.postgresql.jdbc.PgPreparedStatement.executeUpdate(PgPreparedStatement.java:136) ~[postgresql-42.1.3.jar:42.1.3]
>> > > at org.postgresql.jdbc.PgConnection.isValid(PgConnection.java:1311) ~[postgresql-42.1.3.jar:42.1.3]
>> > > at org.apache.manifoldcf.core.jdbcpool.ConnectionPool.getConnection(ConnectionPool.java:92) ~[mcf-core.jar:?]
>> > > at org.apache.manifoldcf.core.database.ConnectionFactory.getConnectionWithRetries(ConnectionFactory.java:126) ~[mcf-core.jar:?]
>> > > at org.apache.manifoldcf.core.database.ConnectionFactory.getConnection(ConnectionFactory.java:75) ~[mcf-core.jar:?]
>> > > at org.apache.manifoldcf.core.database.Database.executeUncachedQuery(Database.java:797) ~[mcf-core.jar:?]
>> > > at org.apache.manifoldcf.core.database.Database$QueryCacheExecutor.create(Database.java:1457) ~[mcf-core.jar:?]
>> > > at org.apache.manifoldcf.core.cachemanager.CacheManager.findObjectsAndExecute(CacheManager.java:146) ~[mcf-core.jar:?]
>> > > at org.apache.manifoldcf.core.database.Database.executeQuery(Database.java:204) ~[mcf-core.jar:?]
>> > > at org.apache.manifoldcf.core.database.DBInterfacePostgreSQL.performQuery(DBInterfacePostgreSQL.java:837) ~[mcf-core.jar:?]
>> > > at org.apache.manifoldcf.core.database.BaseTable.performQuery(BaseTable.java:221) ~[mcf-core.jar:?]
>> > > at org.apache.manifoldcf.crawler.jobs.Jobs.getActiveJobConnections(Jobs.java:736) ~[mcf-pull-agent.jar:?]
>> > > at org.apache.manifoldcf.crawler.jobs.JobManager.getNextDocuments(JobManager.java:2869) ~[mcf-pull-agent.jar:?]
>> > > at org.apache.manifoldcf.crawler.system.StufferThread.run(StufferThread.java:186) [mcf-pull-agent.jar:?]
>> > > *2) sudo docker logs <CID> --tail 1000*
>> > > Exception in thread "PostgreSQL-JDBC-SharedTimer-1"
>> > > java.lang.OutOfMemoryError: GC overhead limit exceeded
>> > > at java.util.ArrayList.iterator(ArrayList.java:840)
>> > > at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1316)
>> > > at java.net.InetAddress.getAllByName0(InetAddress.java:1277)
>> > > at java.net.InetAddress.getAllByName(InetAddress.java:1193)
>> > > at java.net.InetAddress.getAllByName(InetAddress.java:1127)
>> > > at java.net.InetAddress.getByName(InetAddress.java:1077)
>> > > at java.net.InetSocketAddress.<init>(InetSocketAddress.java:220)
>> > > at org.postgresql.core.PGStream.<init>(PGStream.java:66)
>> > > at org.postgresql.core.QueryExecutorBase.sendQueryCancel(QueryExecutorBase.java:155)
>> > > at org.postgresql.jdbc.PgConnection.cancelQuery(PgConnection.java:971)
>> > > at org.postgresql.jdbc.PgStatement.cancel(PgStatement.java:812)
>> > > at org.postgresql.jdbc.PgStatement$1.run(PgStatement.java:880)
>> > > at java.util.TimerThread.mainLoop(Timer.java:555)
>> > > at java.util.TimerThread.run(Timer.java:505)
>> > > 2019-12-19 18:09:05,848 Job start thread ERROR Unable to write to
>> > > stream logs/manifoldcf.log for appender MyFile
>> > > 2019-12-19 18:09:05,848 Seeding thread ERROR Unable to write to
>> > > stream logs/manifoldcf.log for appender MyFile
>> > > 2019-12-19 18:09:05,848 Job reset thread ERROR Unable to write to
>> > > stream logs/manifoldcf.log for appender MyFile
>> > > 2019-12-19 18:09:05,848 Job notification thread ERROR Unable to
>> > > write to stream logs/manifoldcf.log for appender MyFile
>> > > 2019-12-19 18:09:05,849 Seeding thread ERROR An exception occurred
>> > > processing Appender MyFile
>> > > org.apache.logging.log4j.core.appender.AppenderLoggingException: Error
>> > > flushing stream logs/manifoldcf.log
>> > > at org.apache.logging.log4j.core.appender.OutputStreamManager.flush(OutputStreamManager.java:159)
>> > >
>> > > _Also, I tried the approach of cleaning up the database by
>> > > truncating all ManifoldCF-related tables, but I am still getting
>> > > this error._
>> > >
>> > > The parameters in the *postgresql.conf* file are set as suggested,
>> > > and "max_pred_locks_per_transaction" is set to "256".
>> > > image.png
>> >
>>
>