[ 
https://issues.apache.org/jira/browse/TIKA-3441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17360201#comment-17360201
 ] 

Cristian Zamfir commented on TIKA-3441:
---------------------------------------

[~tallison] - unfortunately there is no 1.27 container available on 
hub.docker.com, which makes it harder to reproduce with that version. I will 
upgrade to 1.26, though.

[~tallison] - I think the defunct tesseract processes could explain it. Since 
they lingered there for a while, it looks like the solution is to clean them 
up explicitly.
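
For context on "defunct": a process shown as <defunct> has already exited and stays in the process table only until its parent collects its exit status, so the cleanup step is the parent waiting on the child after killing it. Below is a minimal Python sketch of that pattern. It is an illustration only, not Tika's actual code; `run_with_timeout` and the sleeping stand-in child are hypothetical names chosen for the example.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout_s):
    """Run cmd; on timeout, kill the child and reap it so no zombie is left."""
    proc = subprocess.Popen(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    try:
        # Normal case: the child finishes in time and wait() reaps it.
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        proc.kill()         # forcibly terminate the hung child
        return proc.wait()  # reap it: collecting the exit status clears the entry


if __name__ == "__main__":
    # Hypothetical stand-in for a hung OCR process: a child that sleeps
    # far longer than the timeout.
    hung = [sys.executable, "-c", "import time; time.sleep(60)"]
    print("exit status:", run_with_timeout(hung, timeout_s=0.5))
```

The second `wait()` is the important part: killing alone ends the process, but the process-table entry only goes away once the parent has collected the status.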

[~ndipiazza] 
 * I am using the docker image apache/tika:latest-full (side note: this appears 
to be 1.25 rather than 1.26, although latest was most likely meant to point to 
1.26). 
 * The custom parameters (also visible in the Environment field in Jira) 
are: {{-spawnChild TIKA_CHILD_JVM_OPTS=-JXmx3g -JXX:+ExitOnOutOfMemoryError 
-status}}
 * To reproduce, just feed it a lot of files. In this particular case the 
issue appeared after a few hours, following a parse timeout error. That timeout 
caused a restart of the tika child, which then got the bind error. 
 * Tesseract is enabled using the default settings in the apache docker image.
 * I mentioned some other environment factors like k8s horizontal scaling; you 
can disregard those. I am fairly sure this is an issue with the OS keeping the 
port open, so it can manifest with a single tika docker container.
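
To illustrate the suspicion that rebinding too quickly is part of the problem, here is a minimal Python sketch of binding with exponential backoff rather than in a tight loop. It is not how tika-server is implemented; `bind_with_backoff`, its parameters, and the delays are assumptions made up for the example. Setting SO_REUSEADDR lets a restarted server rebind while old connections on the port are still in TIME_WAIT.

```python
import socket
import time


def bind_with_backoff(host, port, attempts=5, first_delay_s=0.5):
    """Try to bind a listening socket, sleeping longer after each failure."""
    delay = first_delay_s
    for attempt in range(1, attempts + 1):
        sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        # Allow rebinding while old connections on the port sit in TIME_WAIT.
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        try:
            sock.bind((host, port))
            sock.listen()
            return sock
        except OSError:
            sock.close()
            if attempt == attempts:
                raise  # give up and surface the error instead of spinning
            time.sleep(delay)
            delay *= 2  # exponential backoff instead of a tight retry loop


if __name__ == "__main__":
    # Port 0 asks the OS for any free port, so this demo does not collide
    # with a real server.
    server = bind_with_backoff("127.0.0.1", 0)
    print("listening on", server.getsockname())
    server.close()
```

Bounding the number of attempts also means a container that cannot bind exits with an error rather than burning CPU, which is easier for an orchestrator to detect and restart.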

> tika server stuck in loop trying to bind
> ----------------------------------------
>
>                 Key: TIKA-3441
>                 URL: https://issues.apache.org/jira/browse/TIKA-3441
>             Project: Tika
>          Issue Type: Bug
>          Components: docker, server
>    Affects Versions: 1.25
>         Environment: {{PID TTY STAT TIME COMMAND}}
> {{ 1 ? Ssl 2:54 java -jar /tika-server-1.25.jar -h 0.0.0.0 -spawnChild 
> TIKA_CHILD_JVM_OPTS=-JXmx3g -JXX:+ExitOnOutOfMemoryError -status}}
> {{ 127 ? Z 1:15 [tesseract] <defunct>}}
> {{ 131 ? Z 1:31 [tesseract] <defunct>}}
> {{ 139 ? Z 0:16 [tesseract] <defunct>}}
> {{ 219 ? Z 40:20 [tesseract] <defunct>}}
> {{ 324 ? Z 2:32 [tesseract] <defunct>}}
> {{ 342 ? Z 0:09 [tesseract] <defunct>}}
> {{ 343 ? Z 0:09 [tesseract] <defunct>}}
> {{ 380 ? Z 0:00 [tesseract] <defunct>}}
> {{ 430 ? Z 9:18 [tesseract] <defunct>}}
> {{ 435 ? Z 4:52 [tesseract] <defunct>}}
> {{ 446 ? Z 18:34 [tesseract] <defunct>}}
> {{ 453 ? Z 22:33 [tesseract] <defunct>}}
> {{ 461 ? Z 0:25 [tesseract] <defunct>}}
> {{ 517 ? Z 4:35 [tesseract] <defunct>}}
> {{ 526 ? Z 0:04 [tesseract] <defunct>}}
> {{ 536 ? S 0:00 36:39}}
> {{ 881708 pts/0 Ss 0:00 bash}}
> {{ 883782 ? Sl 0:00 java -XX:+ExitOnOutOfMemoryError -Djava.awt.headless=true 
> -cp /tika-server-1.25.jar org.apache.tika.server.TikaServerCli -h 0.0.0.0 
> TIKA_CHILD_JVM_OPTS=-JXmx3g -status --id 
> ab854166-0a4b-4ecc-93a5-e4504d11e4e8}}
> {{ 883801 pts/0 R+ 0:00 ps ax}}
>            Reporter: Cristian Zamfir
>            Priority: Major
>         Attachments: bind_issue.txt, logs_before.txt
>
>
> Tika seems to be stuck in a loop trying to bind. Please see the attached logs.
> It also uses a lot of CPU while doing so, so in an environment with 
> horizontal scaling it causes other containers to spin up.
> This happens from time to time to one of the multiple containers 
> running; normally everything runs smoothly.
> I am not sure why it fails to bind in the first place (it is running in docker).
> I suspect that it may be related to the fact that it is trying to bind very 
> quickly.
> Ideas on how to debug this issue further are welcome.
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
