Hi!

I noticed that Tika logs "OOM (null)" but seems to recover by itself, even
when not using -spawnChild. Is this the expected behavior? I am trying to
figure out when log entries containing "OOM" are critical and would
require a container restart.

I also wanted to bring up two of my questions from the message below. I am
looking forward to your feedback:
1. Do you have a recommendation for a stress test that would allow me to
easily test OOM behavior?
2. For implementing a health check that detects when Tika is stuck, I could
periodically send a simple request and check that the reply is correct. Do
you recommend a better approach?
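
To make question 2 concrete, here is a minimal sketch of the probe I have in
mind. It assumes tika-server is reachable on its default port 9998 and that
PUT /tika with a text/plain body returns the extracted text; the sentinel
string and the function name are hypothetical, and I have not tuned the
timeout against a loaded server.

```python
import http.client
import urllib.error
import urllib.request

# Assumption: default tika-server port and the PUT /tika text extraction endpoint.
TIKA_URL = "http://localhost:9998/tika"
# Hypothetical marker text; any short string we can recognise in the reply works.
SENTINEL = "tika-liveness-probe"


def tika_is_healthy(url=TIKA_URL, timeout=10.0):
    """Return True if the server echoes the sentinel text back within `timeout` seconds."""
    request = urllib.request.Request(
        url,
        data=SENTINEL.encode("utf-8"),
        method="PUT",
        headers={"Content-Type": "text/plain", "Accept": "text/plain"},
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            # Healthy only if the request succeeded AND the parsed text is correct.
            return response.status == 200 and SENTINEL in body
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeout, HTTP error, or broken response:
        # all of these are treated as "not healthy".
        return False
```

A wrapper script could call tika_is_healthy() and exit non-zero on False, so
that it can be used as an exec-style liveness check. Sending a real parse
request, rather than only opening a connection, is meant to catch the case
where the server still accepts connections but no longer parses anything.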

Thanks,
Cristi

On Sat, May 29, 2021 at 2:58 PM Cristian Zamfir <[email protected]>
wrote:

>
> > On 28 May 2021, at 19:03, Tim Allison <[email protected]> wrote:
> >
> > Tika 2.x should help with this in pipes and async.  Your system should
> > expect to go oom or crash at some point if you're processing enough
> > files.
>
> I believe that this is what is happening in my case. It’s not due to a
> single file; it happens under high load when processing many files at once.
>
> >
> > Right --spawnChild is not default in 1.x, but it will be in 2.x.  And,
> > yes, you should be using it. To set the Xmx in the forked process add
> > -J, as in -JXmx2g would set the Xmx for the forked process.
>
>
> Did both now and I think this provides good recovery from OOM.
>
>
> >
> > I don't have experience to recommend bumping Xmx to close to your
> > container's max memory. In java programs that do a bunch of work off
> > heap, this would be a bad idea because you need to leave resources for
> > your system os, but I don't think we do much off heap.
>
> What’s your take on a configuration in which the container is capped at
> 4GB and the spawned child has a heap limit of 3GB? Sounds like a pretty
> safe margin to me.
>
> >
> > Which file types are causing OOMs?  The MP4Parser is notorious, and
> > we're looking to swap it out in 2.x for a different parser.
>
> Good to hear. I don’t know how to identify the root cause because there
> are many files sent at once.
> However, it would be great to learn if there is a quick way to trigger a
> high load and test resiliency to OOM. Do you have a recommendation?
>
>
> >
> > Yep, TIKA-3353 is the monitoring that Nick was mentioning.
>
> I am actually more interested in health checks, to detect when the system
> is stuck without automatically restarting. A built-in health check would
> certainly be a nice feature.
>
> Besides OOM, one other possible cause is if /tmp gets full - for instance
> I see here
> https://github.com/tongwang/tika-server-docker/blob/master/bin/healthcheck
> that /tmp is cleaned up periodically and the health check fails if it is
> too full.
>
> Are there any other situations that could indicate that the container is
> stuck and needs a restart and if yes, is there a way to detect the
> condition?
>
> Thanks,
> Cristi
>
> >
> > On Fri, May 28, 2021 at 9:08 AM Cristian Zamfir <[email protected]> wrote:
> >>
> >> Thanks for your answer Nick!
> >>
> >> I am running apache/tika:latest-full which is using 1.25. Looks like I
> >> need at least version 1.26 for
> >> https://issues.apache.org/jira/browse/TIKA-3353, but I am not sure
> >> whether this is overkill for implementing basic liveness health checks.
> >>
> >> It's clear that -spawnChild and ForkParser are two must-haves that,
> >> AFAIU, are not default in apache/tika:latest-full.
> >>
> >> My guess is that I also need to set the JVM heap size close to the
> >> memory resource limit for the container, but that's not ideal because
> >> the heap size would be statically configured while the memory resource
> >> limits are dynamic. Or maybe this is not necessary if I use -spawnChild?
> >>
> >> I am looking forward to your answers, thanks a lot!
> >>
> >> Cristi
> >>
> >>
> >> On Fri, May 28, 2021 at 2:55 PM Nick Burch <[email protected]> wrote:
> >>>
> >>> On Thu, 27 May 2021, Cristian Zamfir wrote:
> >>>> I am running some stress tests of the latest tika server docker (not
> >>>> modified in any way, just pulled from the registry) and seeing that
> >>>> after a few hours I see OOM in the logs. The container has a limit of
> >>>> 4GB set in K8S. I am wondering if you have any best practices on how
> >>>> to avoid this.
> >>>
> >>> Hopefully one of our Tika+Docker experts will be along in a minute to
> >>> help advise!
> >>>
> >>> For now, the general advice is documented at:
> >>>
> >>> https://cwiki.apache.org/confluence/display/TIKA/The+Robustness+of+Apache+Tika
> >>>
> >>> Also, which version of Tika are you on? There have been some
> >>> contributions recently around monitoring the server, which you might
> >>> want to upgrade for, e.g. TIKA-3353
> >>>
> >>> Nick
>
>
