Superskyyy opened a new issue, #10408:
URL: https://github.com/apache/skywalking/issues/10408

   ### Search before asking
   
   - [X] I had searched in the 
[issues](https://github.com/apache/skywalking/issues?q=is%3Aissue) and found no 
similar feature requirement.
   
   
   ### Description
   
   Currently, our Python agent implements its data reporters with the 
`threading` module. As the agent has grown from tracing-only into a fully 
capable agent, it needs more resources than before: we start many threads, and 
gRPC itself creates even more threads when streaming.
   
   Casual testing with wrk against FastAPI + Uvicorn shows that running the 
agent alongside them considerably reduces throughput.
   
   Background:
   
   In Python, the Global Interpreter Lock (GIL) means that **at any given time 
only one thread can execute its code,** with the exception of I/O time and C 
library time. This holds at least up to Python 3.12 (there's hope that the GIL 
will be removed in the somewhat near future). Since our data reporting is mostly 
I/O bound, the threads do not block each other, but they waste a lot of work 
switching between threads to check whether their I/O tasks have completed.
   
   *Asyncio*: asyncio is a built-in Python library that provides cooperative 
multitasking through async/await coroutines. Each of the protocols we use 
(gRPC, HTTP, Kafka) has mature asyncio-based clients. This would eliminate 
thread-switching cost within the agent scope and give us finer control over 
the I/O wait.
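
   As a rough illustration of the idea above, here is a minimal sketch of a 
coroutine-based reporter using only the standard library. The names 
`send_batch` and `report_loop`, the batch size, and the `None` sentinel are 
hypothetical and not part of the agent; `send_batch` stands in for whatever 
async gRPC/HTTP/Kafka client call would be used.

```python
import asyncio


async def send_batch(batch, sent):
    """Hypothetical stand-in for an async gRPC/HTTP/Kafka client call."""
    await asyncio.sleep(0)  # yields to the event loop, as real network I/O would
    sent.extend(batch)


async def report_loop(queue, sent, batch_size=3):
    """Drain the queue and ship items in batches until a None sentinel arrives."""
    batch = []
    while True:
        item = await queue.get()  # suspends this coroutine; no thread is blocked
        if item is None:
            break
        batch.append(item)
        if len(batch) >= batch_size:
            await send_batch(batch, sent)
            batch = []
    if batch:  # flush whatever is left before stopping
        await send_batch(batch, sent)


async def main():
    queue = asyncio.Queue(maxsize=100)
    sent = []
    reporter = asyncio.create_task(report_loop(queue, sent))
    for i in range(7):
        await queue.put(f"segment-{i}")
    await queue.put(None)  # sentinel: ask the reporter to flush and stop
    await reporter
    return sent


result = asyncio.run(main())
```

   Producer and reporter share one thread here; control only changes hands at 
the `await` points, which is where the thread-switching cost would go away.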
   
   *Uvloop*: uvloop is also a mature library; Uvicorn uses it as a drop-in 
replacement for the built-in asyncio event loop. It can provide a 2-4x speedup 
over the built-in event loop.
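
   A minimal sketch of how the agent might pick its event loop, assuming uvloop 
is treated as an optional dependency (that choice, and the helper name 
`new_agent_loop`, are assumptions for illustration). It falls back to the 
built-in loop when uvloop is not installed, for example on Windows, which 
uvloop does not support.

```python
import asyncio

try:
    import uvloop  # third-party and optional
except ImportError:
    uvloop = None


def new_agent_loop():
    """Return a uvloop event loop when available, else the built-in one."""
    if uvloop is not None:
        return uvloop.new_event_loop()
    return asyncio.new_event_loop()


async def ping():
    await asyncio.sleep(0)
    return "ok"


loop = new_agent_loop()
try:
    outcome = loop.run_until_complete(ping())
finally:
    loop.close()
```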
   
   The plan is to deprecate the current data reporters (Trace/Log/Meter), and 
maybe also the profilers, or to provide an alternative implementation of them. 
The alternative implementation should also work for gRPC/HTTP/Kafka using the 
corresponding async clients.
   
   gRPC with asyncio API: https://grpc.github.io/grpc/python/grpc_asyncio.html
   
   To replace: Sync API
   
   HTTP: aiohttp / HTTPX. I personally prefer aiohttp since it's more mature, 
but the two should be easy to swap and either is okay (why not try both and see 
which is better).
   
   To replace: Requests
   
   
   Kafka: 
   Confluent (seems better yet asyncio support is a bit shaky) 
https://www.confluent.io/blog/kafka-python-asyncio-integration/
   aiokafka (may not be that actively maintained) 
https://github.com/aio-libs/aiokafka
   
   To replace: kafka-python (plus it's unmaintained)
   
   Important consideration: an event loop cannot survive a fork, so be careful 
to postpone agent start whenever a fork can be predicted (so that setups like 
Gunicorn with Uvicorn workers can work properly). 
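
   One possible way to postpone startup is to create the loop lazily and key it 
to the process ID, so a forked child builds its own loop on first use. This is 
only a sketch under assumptions: the `LazyAgent` class, its methods, and the 
choice to run the loop in a daemon thread are hypothetical and not the agent's 
actual design.

```python
import asyncio
import os
import threading


class LazyAgent:
    """Hypothetical agent wrapper that owns one event loop per process."""

    def __init__(self):
        self._pid = None
        self._loop = None
        self.starts = 0

    def ensure_started(self):
        """Start (or restart) the loop thread if this process has none."""
        pid = os.getpid()
        if self._pid == pid:
            return self._loop
        # First call, or we are in a forked child: the parent's loop thread
        # does not exist in this process, so build a fresh loop and thread.
        self._loop = asyncio.new_event_loop()
        threading.Thread(target=self._loop.run_forever, daemon=True).start()
        self._pid = pid
        self.starts += 1
        return self._loop

    def submit(self, coro):
        """Schedule a coroutine on the agent loop from any thread."""
        loop = self.ensure_started()
        return asyncio.run_coroutine_threadsafe(coro, loop)


async def report(segment):
    await asyncio.sleep(0)
    return f"reported {segment}"


agent = LazyAgent()
first = agent.submit(report("a")).result(timeout=5)
second = agent.submit(report("b")).result(timeout=5)
```

   Because nothing starts at import time, a pre-fork server can load the 
application in the parent and each worker still gets its own loop the first 
time it reports.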
   
   Since FastAPI/ASGI-based web frameworks now dominate the Python web stack, 
and direct fork usage is very rare (an event loop can safely use a 
`ProcessPoolExecutor`), this change can happen almost without breaking any user 
application.
   
   The old reporters should be kept as a fallback for a release or two and then 
slowly deprecated.
   
   
   
   ### Use case
   
   Make the I/O overhead that the Python agent adds to user applications much 
lower.
   
   
   
   ### Related issues
   
   There could also be overhead from non-optimized span creation, yet the I/O 
overhead should be addressed first as the primary target. 
   
   ### Are you willing to submit a PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of 
Conduct](https://www.apache.org/foundation/policies/conduct)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
