Hi, I am seeing some cases with Tika 2.2.1 where, despite setting a watchdog to limit the heap to 3GB, the entire Tika container grows past 6GB, which exceeds its resource memory limit, so it gets OOM-killed. Here is one example:
total-vm:8109464kB, anon-rss:99780kB, file-rss:28204kB, shmem-rss:32kB, UID:0 pgtables:700kB oom_score_adj:-997

Only some files seem to cause this behavior. The memory ramps up fairly quickly: in a few tens of seconds it can go from 1GB to 6GB.

The next step is to check whether this goes away with 2.8.0, but I wonder if either of the following explanations makes sense:

1. The JVM is slow to observe the forked process exceeding its heap and does not terminate it in time.
2. It's not the heap that grows, but there is some stack overflow due to very deep recursion.

Finally, are there any file types that are known to use a lot of memory with Tika?

Thanks,
Cristi