bzp2010 commented on code in PR #12338:
URL: https://github.com/apache/apisix/pull/12338#discussion_r2159879100


##########
apisix/utils/batch-processor-manager.lua:
##########
@@ -94,11 +113,22 @@ function _M:add_entry(conf, entry)
     end
 
     log_buffer:push(entry)
+    self.total_pushed_entries = self.total_pushed_entries + 1
     return true
 end
 
 
-function _M:add_entry_to_new_processor(conf, entry, ctx, func)
+function _M:add_entry_to_new_processor(conf, entry, ctx, func, max_pending_entries)
+    if max_pending_entries then
+        local total_processed_entries_count = total_processed_entries(self)
+        if self.total_pushed_entries - total_processed_entries_count > max_pending_entries then
+            core.log.error("max pending entries limit exceeded. discarding entry.",
+                           " total_pushed_entries: ", self.total_pushed_entries,
+                           " total_processed_entries: ", total_processed_entries_count,
+                           " max_pending_entries: ", max_pending_entries)

Review Comment:
   Also, both mechanisms achieve their purpose by discarding log entries, so how does this differ from the existing control over the maximum buffer size that is already achieved by reducing the number of retries?
   
   > Specifically, by reducing `max_retry_count` to `0` or `1` and configuring the `retry_delay` to a shorter time, the maximum amount of time that each log stays in the buffer can be precisely controlled.
   > All of this is an existing mechanism; what added value does the additional new mechanism provide, other than more confusion?
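
   For reference, the existing mechanism looks roughly like this from a plugin's point of view (a minimal sketch; the values are made up for illustration, but `max_retry_count`, `retry_delay`, and the other attributes are the standard batch-processor settings):

   ```lua
   -- a minimal sketch of the existing batch-processor knobs
   -- (illustrative values; this is the conf table a logger plugin
   -- hands down to the batch processor)
   local conf = {
       batch_max_size   = 1000, -- flush as soon as 1000 entries accumulate
       inactive_timeout = 5,    -- flush after 5s with no new entries
       buffer_duration  = 60,   -- max age in seconds of an entry in a batch
       max_retry_count  = 0,    -- drop a failed batch immediately, no retries
       retry_delay      = 1,    -- seconds to wait between retries, if any
   }
   ```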
   
   This mechanism discards logs before the data is even added to the buffer, dealing a serious blow to the reliability and meaning of logging. What is the point of a logging mechanism whose behavior cannot even remotely be explained, if logs can be discarded on the fly and without warning?
   A log may be discarded at runtime under a variety of conditions, and a user will find that a batch of logs critical to diagnosing a failure has inexplicably disappeared from the dataset, an obvious blow to their confidence.
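
   For contrast, the proposed guard reduces to roughly this (a sketch based on my reading of the diff; the early `return false` is my assumption about the code truncated after the `core.log.error` call):

   ```lua
   -- sketch of the proposed check: the entry is rejected *before*
   -- it is ever buffered, so it is lost with no chance of delivery
   local pending = self.total_pushed_entries - total_processed_entries(self)
   if max_pending_entries and pending > max_pending_entries then
       core.log.error("max pending entries limit exceeded. discarding entry.")
       return false -- assumption: the entry is dropped here, never buffered
   end
   ```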
   
   ----
   
   If some users are seeing massive memory consumption, or even OOM, because of all this buffering, the first thing that should come under scrutiny is whether their log-receiving system has enough capacity to accept the entirety of the logs. If it does not, that is not APISIX's concern and it is not our problem; reliable delivery of diagnostic material at all times is what we should be focused on.
   The second thing that should be scrutinized is whether these users have reserved sufficient memory resources for APISIX to operate (both the pod's memory limit and the entire node's). What does it matter to us if their memory consumption was never properly evaluated, leading to unreasonable pre-configured memory limits and processes being killed by the kernel?
   


