sravani-revuri commented on PR #10523: URL: https://github.com/apache/ozone/pull/10523#issuecomment-4806865885
From the logs, this appears to be the main cause of the error:

```
Caused by: java.util.concurrent.RejectedExecutionException: Task okhttp3.internal.connection.RealCall$AsyncCall@611166bd rejected from java.util.concurrent.ThreadPoolExecutor@73163d48[Running, pool size = 5, active threads = 5, queued tasks = 0, completed tasks = 5]
```

### Issue:
The exporter can't send data to the collector (Jaeger) and drops spans because all of its threads are occupied. There are two possible causes:

1) The Jaeger container is down, so the exporter can't export spans to the collector and they are dropped. (Ruling this out because nothing in the logs suggests it.)
2) The exporter can't keep up with the load of spans being sent to it. The error also appears in the OM and SCM logs.
   - Multiple commands are run and the sampling level is set to 100%, so everything is recorded.
   - This can cause intermittent failures between runs, depending on the burst of spans in that specific run.
   - The code currently uses `SimpleSpanProcessor`, which sends data to the exporter as soon as `span.end()` is called. Switching to `BatchSpanProcessor`, which batches spans into fewer exports, reduces this.

@adoroszlai
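To illustrate the second cause, here is a minimal stand-alone sketch using only `java.util.concurrent`. It is a model, not the exporter's actual code: the pool settings (at most 5 threads, no task queue) are an assumption inferred from the `pool size = 5, active threads = 5, queued tasks = 0` values in the log, and `ExportPoolSketch`, `newPool` and `droppedSpans` are hypothetical names. Each submitted task stands in for one export call to a slow collector.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class ExportPoolSketch {

    // Assumed model of the exporter's dispatcher pool: up to 5 threads and
    // a hand-off queue that holds no tasks (so "queued tasks" is always 0).
    static ThreadPoolExecutor newPool() {
        return new ThreadPoolExecutor(0, 5, 60, TimeUnit.SECONDS, new SynchronousQueue<>());
    }

    // Submits one export task per batch of `batchSize` spans while every
    // earlier export is still in flight, and returns how many spans were
    // dropped because the pool rejected their export task.
    public static int droppedSpans(int spans, int batchSize) throws InterruptedException {
        ThreadPoolExecutor pool = newPool();
        CountDownLatch release = new CountDownLatch(1); // keeps exports "in flight"
        int dropped = 0;
        try {
            for (int sent = 0; sent < spans; sent += batchSize) {
                int size = Math.min(batchSize, spans - sent);
                try {
                    pool.execute(() -> {
                        try {
                            release.await(); // simulate a slow collector round trip
                        } catch (InterruptedException e) {
                            Thread.currentThread().interrupt();
                        }
                    });
                } catch (RejectedExecutionException e) {
                    dropped += size; // all threads busy: this export is lost
                }
            }
        } finally {
            release.countDown(); // let the simulated exports finish
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
        return dropped;
    }

    public static void main(String[] args) throws InterruptedException {
        // One export per span, as with a per-span processor.
        System.out.println("per-span exports, dropped: " + droppedSpans(20, 1));
        // Five spans per export, as with a batching processor.
        System.out.println("batched exports, dropped:  " + droppedSpans(20, 5));
    }
}
```

In this model a burst of 20 spans exported one at a time loses 15 of them, because only the first 5 export tasks get a thread. The same burst sent as 4 batches of 5 loses none. The real processors differ in detail, but the ratio of export calls to available threads is the point.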
