devkanro opened a new issue #4282: Collector base on async stack
URL: https://github.com/apache/skywalking/issues/4282
 
 
   ## Theme
   This issue discusses async segment collectors.
   
   ## Problem
   I found that the OAP server has poor throughput when collecting traces.
   I tried to use SkyWalking in my production environment, but there are too many segments to collect.
   Even after scaling the OAP server to 8c16g x 8 instances, there are still many collection errors (gRPC client canceled).
   CPU usage looks fine on both the OAP server and the ES server.
   
   Maybe separating the collectors from the OAP server is a good idea?
   Instead of calling the OAP server directly to report spans, agents would write to a message queue or a log collector, and collectors would consume those messages or logs. Each collector might need only 1c2g or 2c4g, so we can run many collectors at low cost.
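
To make the proposed data flow concrete, here is a minimal in-process sketch. It is only an illustration: Python's `queue.Queue` stands in for a real message queue, the `stored` list stands in for the storage backend, and `run_pipeline` and the `segment_id` field are hypothetical names, not SkyWalking APIs.

```python
import queue
import threading


def run_pipeline(num_segments, num_collectors):
    """Agents enqueue segments; lightweight collectors drain the queue."""
    mq = queue.Queue()   # assumption: stands in for a real message queue
    stored = []          # assumption: stands in for the storage backend
    lock = threading.Lock()

    def collector():
        # Each collector handles one segment at a time.
        while True:
            segment = mq.get()
            if segment is None:      # sentinel: no more work for this collector
                mq.task_done()
                return
            with lock:
                stored.append(segment)
            mq.task_done()

    workers = [threading.Thread(target=collector) for _ in range(num_collectors)]
    for w in workers:
        w.start()

    # Agent side: write and move on, without waiting for a collector's result.
    for i in range(num_segments):
        mq.put({"segment_id": i})

    # Shut down: one sentinel per collector, then wait for the queue to drain.
    for _ in workers:
        mq.put(None)
    mq.join()
    for w in workers:
        w.join()
    return stored


if __name__ == "__main__":
    result = run_pipeline(num_segments=100, num_collectors=4)
    print(len(result))
```

The agent loop never blocks on a collector response, which is the property the proposal relies on. A real deployment would replace the in-memory queue with a durable broker.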
   
   ## Opinion
   This approach has several advantages.
   
   01. Improved agent performance
   Agents do not need a result from the collectors; they just keep writing.
   02. Elastic scaling
   The OAP server is heavy and expensive; it needs at least 4c8g. A collector needs only 1c2g or less, because it handles one segment at a time.
   03. Delay in exchange for stability and safety
   When the collectors cannot keep up with the agents, the message queue retains the data, so the result is some delay rather than data loss.
   04. Throughput based on the number of collectors
   The more collectors you run, the more message throughput you get.
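
Points 03 and 04 can be illustrated with a small backlog calculation. This is an idealized sketch, not a measurement: it assumes the queue has unlimited retention and that throughput scales linearly with the number of collectors, and `simulate_backlog` and all the numbers are hypothetical.

```python
def simulate_backlog(arrivals, collectors, per_collector_rate):
    """Return (backlog after each tick, total segments processed).

    arrivals: number of segments the agents write in each tick.
    Assumes an unbounded queue and linear scaling across collectors.
    """
    capacity = collectors * per_collector_rate
    backlog = 0
    processed = 0
    history = []
    for arrived in arrivals:
        backlog += arrived               # agents write without waiting
        done = min(backlog, capacity)    # collectors consume what they can
        backlog -= done
        processed += done
        history.append(backlog)
    return history, processed


if __name__ == "__main__":
    burst = [100, 300, 300, 50, 0, 0]
    print(simulate_backlog(burst, collectors=2, per_collector_rate=100))
    # → ([0, 100, 200, 50, 0, 0], 750)
    print(simulate_backlog(burst, collectors=4, per_collector_rate=100))
    # → ([0, 0, 0, 0, 0, 0], 750)
```

With two collectors the burst is absorbed as a temporary backlog and every segment is still processed; with four collectors there is no backlog at all.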

