Actually maybe this is related:
java -jar ./tika-server-standard-2.0.0-BETA.jar
SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
details.

Tika 1.x shipped with all the logging dependencies; I did not find anything
about this in the changelog.
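
For what it's worth, the NOP-logger warning just means no SLF4J binding is on
the classpath, so the server runs but emits no log output. One possible
workaround (a sketch only; the binding jar name and version are assumptions,
and the main class is the one shown in the stack trace below) is to build the
classpath by hand and launch the CLI class directly, since -jar ignores -cp:

# slf4j-simple is just one example binding; any SLF4J 1.7.x binding should do
java -cp "./tika-server-standard-2.0.0-BETA.jar:./slf4j-simple-1.7.30.jar" \
    org.apache.tika.server.core.TikaServerCli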



On Thu, Jun 10, 2021 at 6:51 PM Cristian Zamfir <[email protected]>
wrote:

> I see, thanks. Has TIKA_CHILD_JVM_OPTS=-JXmx been replaced by the
> configuration option forkedJvmArgs or do they still both work? Guessing
> that it is fully replaced.
>
> When I switched to a config file for the server, I noticed that some of the
> options I can see in the GitHub repo do not seem to work, for instance log
> and includeStack. Have there been changes to these options in master
> compared to when 2.0.0-BETA was released?
>
> java -jar ./tika-server-standard-2.0.0-BETA.jar --config
>  ./tika-server-config.xml
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
> details.
> org.apache.tika.exception.TikaConfigException: Couldn't find setter:
> setLog for object class org.apache.tika.server.core.TikaServerConfig
> at org.apache.tika.config.ConfigBase.tryToSet(ConfigBase.java:389)
> at org.apache.tika.config.ConfigBase.setParams(ConfigBase.java:222)
> at org.apache.tika.config.ConfigBase.setParams(ConfigBase.java:190)
> at org.apache.tika.config.ConfigBase.configure(ConfigBase.java:432)
> at
> org.apache.tika.server.core.TikaServerConfig.load(TikaServerConfig.java:179)
> at
> org.apache.tika.server.core.TikaServerConfig.load(TikaServerConfig.java:172)
> at
> org.apache.tika.server.core.TikaServerConfig.load(TikaServerConfig.java:128)
> at org.apache.tika.server.core.TikaServerCli.execute(TikaServerCli.java:83)
> at org.apache.tika.server.core.TikaServerCli.main(TikaServerCli.java:66)
>
>
> <properties>
>   <server>
>     <params>
>       <log>info</log>
>       <!-- <includeStack>false</includeStack> -->
>       <forkedJvmArgs>
>         <arg>-Xmx3g</arg>
>       </forkedJvmArgs>
>       <endpoints>
>         <endpoint>status</endpoint>
>         <endpoint>tika</endpoint>
>       </endpoints>
>     </params>
>   </server>
> </properties>
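
Side note: given the "Couldn't find setter: setLog" error above, it looks
like 2.0.0-BETA's TikaServerConfig simply does not recognize a <log> param,
so a config without it should at least load. A minimal sketch, assuming the
remaining params shown here are accepted by the BETA (log verbosity would
then have to be configured outside this file):

# Hypothetical minimal config: same as above but with the unrecognized <log>
# param dropped; forkedJvmArgs and endpoints are kept as-is.
cat > tika-server-config.xml <<'EOF'
<properties>
  <server>
    <params>
      <forkedJvmArgs>
        <arg>-Xmx3g</arg>
      </forkedJvmArgs>
      <endpoints>
        <endpoint>status</endpoint>
        <endpoint>tika</endpoint>
      </endpoints>
    </params>
  </server>
</properties>
EOF
java -jar ./tika-server-standard-2.0.0-BETA.jar --config ./tika-server-config.xml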
>
> Thanks,
> Cristi
>
> On Thu, Jun 10, 2021 at 5:36 PM Tim Allison <[email protected]> wrote:
>
>> I just updated the wiki.  I haven't put in an anchor yet, but see:
>> https://cwiki.apache.org/confluence/display/TIKA/TikaServer and search
>> for 'status' at the bottom of the page.
>>
>> Please let us know if you have any questions.
>>
>> Best,
>>
>>               Tim
>>
>>
>> On Thu, Jun 10, 2021 at 11:22 AM Cristian Zamfir <[email protected]>
>> wrote:
>> >
>> > It appears that the -status option was dropped in 2.x - was it replaced
>> by something else?
>> >
>> > Thanks,
>> > Cristi
>> >
>> >
>> > On Wed, Jun 2, 2021 at 4:54 PM Tim Allison <[email protected]> wrote:
>> >>
>> >> >I wanted to double check that -JXX:+ExitOnOutOfMemoryError should be
>> provided to the main process or to the child, can you please confirm?
>> >>
>> >> Yes
>> >>
>> >> On Wed, Jun 2, 2021 at 10:49 AM Cristian Zamfir <[email protected]>
>> wrote:
>> >> >
>> >> >
>> >> >
>> >> > > On 2 Jun 2021, at 15:33, Cristian Zamfir <[email protected]>
>> wrote:
>> >> > >
>> >> > >>
>> >> > >> On 2 Jun 2021, at 14:43, Tim Allison <[email protected]> wrote:
>> >> > >>
>> >> > >>> I noticed that Tika prints in the logs OOM (null), but seems to
>> recover by itself even when not using -spawnChild. Is this the expected
>> behavior?
>> >> > >>
>> >> > >> When not in -spawnChild mode, Tika is catching OOM exceptions
>> (when it
>> >> > >> can), but it isn't "recovering"... the jvm may be in an
>> inconsistent
>> >> > >> state, and it is safest to restart the jvm.  It would probably be
>> good
>> >> > >> practice when in -spawnChild mode to use
>> -XX:+ExitOnOutOfMemoryError,
>> >> > >> or on the tika commandline -JXX:+ExitOnOutOfMemoryError.
>> >> > >
>> >> > > Thanks for the clarification, that makes sense. I have already
>> migrated to using -spawnChild. It would be great to make these the default
>> for the docker container; I suspect most people using the docker image will
>> use it in a similar way and could run into OOMs.
>> >> >
>> >> > I tested that these args work:
>> >> > -spawnChild TIKA_CHILD_JVM_OPTS=-JXmx3g -JXX:+ExitOnOutOfMemoryError
>> -status
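
Spelled out as a single 1.x invocation, that would look something like the
sketch below (the jar name/version is an assumption; per Tim's note further
down, the -J-prefixed options are forwarded to the forked child process):

# 1.x sketch: fork a child, cap its heap, kill it on OOM, enable /status
java -jar tika-server-1.26.jar \
    -spawnChild \
    -JXmx3g \
    -JXX:+ExitOnOutOfMemoryError \
    -status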
>> >> > I wanted to double check that -JXX:+ExitOnOutOfMemoryError should be
>> provided to the main process or to the child, can you please confirm?
>> >> >
>> >> >
>> >> > >
>> >> > >
>> >> > >>
>> >> > >> I highly encourage you to use -spawnChild mode, or the new pipes
>> >> > >> modules in 2.x if those will work for you at some point...those
>> are
>> >> > >> still beta.  OOMs are one thing, but infinite loops are another.
>> >> > >>
>> >> > >> 1. Do you have a recommendation for a stress test that would
>> allow me
>> >> > >> to easily test OOM behavior?
>> >> > >> The MockParser is built for exactly this:
>> >> > >>
>> https://cwiki.apache.org/confluence/display/TIKA/MockParser
>> >> > >>
>> >> > >> Let us know if you have any questions about it.  The key elements
>> for
>> >> > >> you are <fakeload/>, <throw/> <oom/> and probably <system_exit/>.
>> >> > >> That's for synthetic load testing.  If you want files in the
>> wild, we
>> >> > >> have 2TB of files from the wild:
>> >> > >>
>> https://corpora.tika.apache.org/base/docs/
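
For a quick synthetic OOM test against the server, a sketch along these lines
might work (the mock document's element names follow the MockParser wiki
linked above, but the exact schema is an assumption, and AFAIK the MockParser
lives in tika-core's test jar, so that jar has to be on the server's
classpath for the mock type to be parsed):

# Hypothetical mock document: emits a little content, then throws an OOM
cat > mock-oom.xml <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<mock>
  <write element="p">some content before the failure</write>
  <oom/>
</mock>
EOF
# PUT it to the server; default port 9998 assumed
curl -s -T mock-oom.xml http://localhost:9998/tika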
>> >> > >
>> >> > > Looks great. Looks like I will need to tweak the container for
>> testing this, but that’s likely fine.
>> >> >
>> >> > Actually, I tested that the server restarts on OOM by using ulimit and
>> then a for loop with curl; it is easy to reproduce.
>> >> >
>> >> > >
>> >> > >>
>> >> > >> 2. For implementing a health check that detects when Tika is
>> stuck, I
>> >> > >> could periodically send a simple request and check that the reply
>> is
>> >> > >> correct, do you recommend a better approach?
>> >> > >> We have a rudimentary /status endpoint, which will give you
>> number of
>> >> > >> restarts, number of files processed, milliseconds since last
>> parse.
>> >> > >> You have to turn it on via the commandline: -status.
>> >> > >
>> >> > > The status endpoint looks like a possible option; I can look for
>> "status": "OPERATING". Sending a single-byte file looks like a decent
>> check as well.
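
Putting the two ideas together, a liveness check could be a small script like
the sketch below (host, port, timeouts and the OPERATING string are
assumptions based on this thread; /status has to be enabled with -status):

#!/bin/sh
# Unhealthy if /status is unreachable or not reporting OPERATING
STATUS_JSON=$(curl -sf --max-time 5 http://localhost:9998/status) || exit 1
echo "$STATUS_JSON" | grep -q 'OPERATING' || exit 1
# Unhealthy if a trivial parse no longer returns (catches a stuck server)
printf 'ping' | curl -sf --max-time 10 -T - http://localhost:9998/tika >/dev/null || exit 1
exit 0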
>> >> > >
>> >> > >
>> >> > >
>> >> > >>
>> >> > >> On Wed, Jun 2, 2021 at 6:50 AM Cristian Zamfir <
>> [email protected]> wrote:
>> >> > >>>
>> >> > >>> Hi!
>> >> > >>>
>> >> > >>> I noticed that Tika prints in the logs OOM (null), but seems to
>> recover by itself even when not using -spawnChild. Is this the expected
>> behavior? I am trying to figure out when logs containing "OOM" are critical
>> and would require a container restart.
>> >> > >>>
>> >> > >>> I also wanted to bring up two of my questions below, I am
>> looking forward to your feedback:
>> >> > >>> 1. Do you have a recommendation for a stress test that would
>> allow me to easily test OOM behavior?
>> >> > >>> 2. For implementing a health check that detects when Tika is
>> stuck, I could periodically send a simple request and check that the reply
>> is correct, do you recommend a better approach?
>> >> > >>>
>> >> > >>> Thanks,
>> >> > >>> Cristi
>> >> > >>>
>> >> > >>> On Sat, May 29, 2021 at 2:58 PM Cristian Zamfir <
>> [email protected]> wrote:
>> >> > >>>>
>> >> > >>>>
>> >> > >>>>> On 28 May 2021, at 19:03, Tim Allison <[email protected]>
>> wrote:
>> >> > >>>>>
>> >> > >>>>> Tika 2.x should help with this in pipes and async.  Your
>> system should
>> >> > >>>>> expect to go oom or crash at some point if you're processing
>> enough
>> >> > >>>>> files.
>> >> > >>>>
>> >> > >>>> I believe that this is what is happening in my case: it’s not
>> due to a single file; it happens under high load when processing many files
>> at once.
>> >> > >>>>
>> >> > >>>>>
>> >> > >>>>> Right, --spawnChild is not the default in 1.x, but it will be in
>> 2.x. And, yes, you should be using it. To set the Xmx in the forked process,
>> add -J; for example, -JXmx2g sets the Xmx for the forked process.
>> >> > >>>>
>> >> > >>>>
>> >> > >>>> I did both now, and I think this provides good recovery from OOM.
>> >> > >>>>
>> >> > >>>>
>> >> > >>>>>
>> >> > >>>>> I don't have the experience to recommend bumping Xmx close to your
>> container's max memory. In Java programs that do a bunch of work off heap,
>> this would be a bad idea because you need to leave resources for the OS, but
>> I don't think we do much off heap.
>> >> > >>>>
>> >> > >>>> What’s your take on a configuration in which the container is
>> capped at 4GB and the spawned child has a heap limit of 3GB? Sounds like a
>> pretty safe margin to me.
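
Concretely, that margin might look something like the sketch below, though I
have not verified that the image forwards extra arguments to the server this
way (the image tag and the flags after the image name are assumptions):

# Cap the container at 4g and the forked child's heap at 3g (sketch only)
docker run --rm -p 9998:9998 --memory=4g apache/tika:1.26-full \
    -spawnChild -JXmx3g -JXX:+ExitOnOutOfMemoryError -status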
>> >> > >>>>
>> >> > >>>>>
>> >> > >>>>> Which file types are causing OOMs?  The MP4Parser is
>> notorious, and
>> >> > >>>>> we're looking to swap it out in 2.x for a different parser.
>> >> > >>>>
>> >> > >>>> Good to hear. I don’t know how to identify the root cause because
>> there are many files sent at once. However, it would be great to learn if
>> there is a quick way to trigger a high load and test resiliency to OOM; do
>> you have a recommendation?
>> >> > >>>>
>> >> > >>>>
>> >> > >>>>>
>> >> > >>>>> Yep, TIKA-3353 is the monitoring that Nick was mentioning.
>> >> > >>>>
>> >> > >>>> I am actually more interested in health checks, to detect when
>> the system is stuck without automatically restarting. A built-in health
>> check would certainly be a nice feature.
>> >> > >>>>
>> >> > >>>> Besides OOM, one other possible cause is if /tmp gets full -
>> for instance I see here
>> https://github.com/tongwang/tika-server-docker/blob/master/bin/healthcheck
>> that /tmp is cleaned up periodically and the health check fails if it is
>> too full.
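
A tiny sketch of that kind of /tmp guard for a health check (the 90%
threshold is arbitrary, and df --output needs GNU coreutils):

# Report unhealthy when /tmp is more than 90% full
USED_PCT=$(df --output=pcent /tmp | tail -n 1 | tr -dc '0-9')
[ "${USED_PCT:-100}" -lt 90 ] || exit 1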
>> >> > >>>>
>> >> > >>>> Are there any other situations that could indicate that the
>> container is stuck and needs a restart and if yes, is there a way to detect
>> the condition?
>> >> > >>>>
>> >> > >>>> Thanks,
>> >> > >>>> Cristi
>> >> > >>>>
>> >> > >>>>>
>> >> > >>>>> On Fri, May 28, 2021 at 9:08 AM Cristian Zamfir <
>> [email protected]> wrote:
>> >> > >>>>>>
>> >> > >>>>>> Thanks for your answer Nick!
>> >> > >>>>>>
>> >> > >>>>>> I am running apache/tika:latest-full, which uses 1.25. It looks
>> like I need at least version 1.26 for
>> https://issues.apache.org/jira/browse/TIKA-3353, but I am not sure whether
>> that is overkill for implementing basic liveness health checks.
>> >> > >>>>>>
>> >> > >>>>>> It's clear that -spawnChild and ForkParser are two must-haves
>> that AFAIU are not the default in apache/tika:latest-full.
>> >> > >>>>>>
>> >> > >>>>>> My guess is that I also need to set the jvm heap size close
>> to the memory resource limit for the container, but that's not ideal
>> because the heap size would be statically configured while the memory
>> resource limits are dynamic. Or maybe this is not necessary if I use
>> -spawnChild?
>> >> > >>>>>>
>> >> > >>>>>> I am looking forward to your answers, thanks a lot!
>> >> > >>>>>>
>> >> > >>>>>> Cristi
>> >> > >>>>>>
>> >> > >>>>>>
>> >> > >>>>>> On Fri, May 28, 2021 at 2:55 PM Nick Burch <
>> [email protected]> wrote:
>> >> > >>>>>>>
>> >> > >>>>>>> On Thu, 27 May 2021, Cristian Zamfir wrote:
>> >> > >>>>>>>> I am running some stress tests of the latest tika server docker
>> (not modified in any way, just pulled from the registry), and after a few
>> hours I see OOM in the logs. The container has a limit of 4GB set in K8S.
>> I am wondering if you have any best practices on how to avoid this.
>> >> > >>>>>>>
>> >> > >>>>>>> Hopefully one of our Tika+Docker experts will be along in a
>> minute to help
>> >> > >>>>>>> advise!
>> >> > >>>>>>>
>> >> > >>>>>>> For now, the general advice is documented at:
>> >> > >>>>>>>
>> https://cwiki.apache.org/confluence/display/TIKA/The+Robustness+of+Apache+Tika
>> >> > >>>>>>>
>> >> > >>>>>>> Also, which version of Tika are you on? There have been some
>> contributions
>> >> > >>>>>>> recently around monitoring the server, which you might want
>> to upgrade
>> >> > >>>>>>> for, eg TIKA-3353
>> >> > >>>>>>>
>> >> > >>>>>>> Nick
>> >> >
>>
>
