sravani-revuri commented on PR #10523:
URL: https://github.com/apache/ozone/pull/10523#issuecomment-4806865885

   From the logs, this seems to be the main cause of the error:
   
   ```
   Caused by: java.util.concurrent.RejectedExecutionException: Task 
okhttp3.internal.connection.RealCall$AsyncCall@611166bd rejected from 
java.util.concurrent.ThreadPoolExecutor@73163d48[Running, pool size = 5, active 
threads = 5, queued tasks = 0, completed tasks = 5]
   ```
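
The rejection in the log above can be reproduced with the JDK alone. This is a minimal sketch, assuming the exporter's HTTP client dispatches calls on a pool capped at 5 threads with no task queue (inferred from `pool size = 5` and `queued tasks = 0` in the log, not confirmed from the exporter's source). The class and method names are hypothetical.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class RejectionDemo {

    /** Submits `tasks` blocking tasks to a pool capped at `maxThreads` and returns how many were rejected. */
    static int countRejected(int maxThreads, int tasks) {
        // Assumption: no queue capacity, so a task either gets a thread right away or is rejected.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, maxThreads, 60, TimeUnit.SECONDS, new SynchronousQueue<>());
        // Stands in for a collector that has not answered yet: every running task waits on it.
        CountDownLatch collectorResponds = new CountDownLatch(1);
        int rejected = 0;
        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(() -> {
                    try {
                        collectorResponds.await();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                // Default AbortPolicy: all threads are busy and nothing can be queued.
                rejected++;
            }
        }
        collectorResponds.countDown();
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return rejected;
    }

    public static void main(String[] args) {
        System.out.println("rejected = " + countRejected(5, 8)); // prints "rejected = 3"
    }
}
```

With 8 in-flight calls and 5 threads, the last 3 submissions are rejected, which in the exporter's case would mean dropped spans.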
   
   ### Issue:
   The exporter isn't able to send data to the collector (Jaeger) and drops 
spans because all 5 threads in its pool are currently occupied.
   There could be 2 causes for this:
   1) The Jaeger container is down, so the exporter can't export the spans to 
the collector and they are dropped. (Ruling this out because the logs don't 
show anything that suggests it.)
   2) The exporter isn't able to keep up with the load of spans being sent to it.
   
   The error is seen across the OM and SCM logs too.
   
   - Multiple commands are being run and the sampling level has been set to 
100%, so every span is recorded.
   - This might cause intermittent failures between runs, depending on the 
burst of spans in that specific run.
   - Also, this code currently uses SimpleSpanProcessor, which sends each span 
to the exporter as soon as span.end() is called. Changing to BatchSpanProcessor 
batches spans into fewer exports, which reduces the load on the exporter.
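
To illustrate why batching lowers the number of export calls, here is a rough, JDK-only sketch. `MiniBatchProcessor` is a hypothetical stand-in, not the OpenTelemetry class: it flushes synchronously when the buffer is full, is not thread-safe, and leaves out the timer-based flush and bounded queue that a real batch processor would have.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public class BatchingSketch {

    /** Hypothetical span processor that buffers ended spans and exports them in batches. */
    static final class MiniBatchProcessor {
        private final List<String> buffer = new ArrayList<>();
        private final int maxBatchSize;
        private final Consumer<List<String>> exporter;

        MiniBatchProcessor(int maxBatchSize, Consumer<List<String>> exporter) {
            this.maxBatchSize = maxBatchSize;
            this.exporter = exporter;
        }

        /** Called when a span ends; exports only once a full batch has accumulated. */
        void onEnd(String span) {
            buffer.add(span);
            if (buffer.size() >= maxBatchSize) {
                flush();
            }
        }

        /** Exports whatever is buffered as a single call. */
        void flush() {
            if (buffer.isEmpty()) {
                return;
            }
            exporter.accept(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    /** Returns how many export calls are made for `spans` ended spans. */
    static int exportCalls(int spans, int maxBatchSize) {
        int[] calls = {0};
        MiniBatchProcessor processor = new MiniBatchProcessor(maxBatchSize, batch -> calls[0]++);
        for (int i = 0; i < spans; i++) {
            processor.onEnd("span-" + i);
        }
        processor.flush(); // export the final partial batch
        return calls[0];
    }

    public static void main(String[] args) {
        System.out.println("one export per span: " + exportCalls(20, 1)); // prints 20
        System.out.println("batches of 8:        " + exportCalls(20, 8)); // prints 3
    }
}
```

A batch size of 1 behaves like exporting on every span.end(), so a burst of 20 spans produces 20 export calls; with batches of 8 the same burst produces 3.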
   
   @adoroszlai.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

