apaloleg commented on PR #12826:
URL: https://github.com/apache/apisix/pull/12826#issuecomment-3691330390
> Hi @apaloleg, thanks for your contribution. I would like to ask if you have encountered the problem described in this PR while actually using APISIX? When evaluating the implementation solutions for this PR, we found that introducing incremental updates might alleviate the problem to some extent, but incremental updates are unreliable, and their reliability is lower than that of full updates after long-term operation. Therefore, I would like to hear about the problems you actually encountered to determine a reasonable solution.

Hi!

In our environment, we currently have over 40,000 routes (and this number is steadily growing), and we're using the `radixtree_host_uri` router. We have frequent route updates (add, update, delete); these can be small operations (1–5 routes) or large ones (50–100+ routes). A series of such operations can last anywhere from 3–5 seconds to 10+ minutes.

At the same time, we have constant traffic on the gateway: around 200–300 RPS on average, and up to 900–1000 RPS during peak hours.

During these update periods, the CPU usage of the APISIX pods spiked to 100%, and the latency of incoming requests increased significantly because requests had to wait for the full rebuild of the radix tree. This was a critical issue for us in production.

The patch from this PR completely resolved our problem. We've been running it in our production environment for about a month now, and we no longer experience these CPU spikes or increased latency during update periods.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
