bzp2010 commented on code in PR #12338:
URL: https://github.com/apache/apisix/pull/12338#discussion_r2159879100
##########
apisix/utils/batch-processor-manager.lua:
##########
@@ -94,11 +113,22 @@ function _M:add_entry(conf, entry)
     end
 
     log_buffer:push(entry)
+    self.total_pushed_entries = self.total_pushed_entries + 1
     return true
 end
 
 
-function _M:add_entry_to_new_processor(conf, entry, ctx, func)
+function _M:add_entry_to_new_processor(conf, entry, ctx, func, max_pending_entries)
+    if max_pending_entries then
+        local total_processed_entries_count = total_processed_entries(self)
+        if self.total_pushed_entries - total_processed_entries_count > max_pending_entries then
+            core.log.error("max pending entries limit exceeded. discarding entry.",
+                           " total_pushed_entries: ", self.total_pushed_entries,
+                           " total_processed_entries: ", total_processed_entries_count,
+                           " max_pending_entries: ", max_pending_entries)

Review Comment:
Also, both approaches achieve their purpose by discarding log entries, so how does this differ from the existing control of the maximum buffer size achieved by reducing the number of retries?

> Specifically, by reducing `max_retry_count` to `0` or `1` and configuring `retry_delay` to a shorter time, the maximum amount of time each log stays in the buffer can be precisely controlled.
> All of this is an existing mechanism; what added value does an additional, new mechanism provide, other than more confusion?

This mechanism discards logs before they are even added to the buffer, dealing a serious blow to the reliability and meaning of logging. What is the point of a logging mechanism whose behavior cannot be explained, where logs can be discarded on the fly and without warning? A log may be discarded at runtime under a variety of conditions, and a user will find that a batch of logs critical to diagnosing a failure has strangely disappeared from the dataset, an obvious blow to their confidence.

----

If some users run into massive memory consumption or even OOM because of all this buffering, the first thing that should come under scrutiny is whether their log-receiving system has enough capacity to accept the logs in their entirety. If it does not, that is not APISIX's concern, and it is not our problem; reliable delivery of diagnostic material at all times is what we should be aiming for.

The second thing to scrutinize is whether these users have reserved sufficient memory resources (both the pod's memory limit and the node's total memory) for APISIX to operate. If their memory consumption was never properly evaluated, so that unreasonable memory limits were pre-configured and processes end up killed by the kernel, what does that have to do with us?
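To make the existing control concrete, here is a minimal sketch of a route using `http-logger` with the batch-processor attributes mentioned above. The endpoint, route, and values are placeholders for illustration only, not a recommendation:

```yaml
# Sketch only: the log receiver address and all values are illustrative.
routes:
  - uri: /anything/*
    upstream:
      type: roundrobin
      nodes:
        "127.0.0.1:8080": 1        # placeholder upstream
    plugins:
      http-logger:
        uri: "http://127.0.0.1:3000/logs"  # placeholder log receiver
        batch_max_size: 1000       # flush once 1000 entries are buffered
        inactive_timeout: 5        # flush after 5s with no new entries
        buffer_duration: 60        # a batch lives at most 60s before it must be processed
        max_retry_count: 1         # give up on a batch after a single retry
        retry_delay: 1             # schedule the retry 1s after a failed flush
```

With `max_retry_count` at `0` or `1` and a short `retry_delay`, a failed batch is dropped quickly instead of lingering, which already bounds how long entries can accumulate in memory; the sketch only shows that these knobs exist today.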