[
https://issues.apache.org/jira/browse/SAMZA-2577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17175116#comment-17175116
]
Rayman commented on SAMZA-2577:
-------------------------------
Sample Log4j1 fix: [https://github.com/apache/samza/pull/1412/files]
> Threads appending to StreamAppender block/deadlock in high tput scenarios,
> leading to processing stalls
> -------------------------------------------------------------------------------------------------------
>
> Key: SAMZA-2577
> URL: https://issues.apache.org/jira/browse/SAMZA-2577
> Project: Samza
> Issue Type: Bug
> Reporter: Rayman
> Priority: Major
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Problem:
> In the StreamAppender for both log4j1 and log4j2, a blocking queue is used to
> coordinate between the append()-ing threads and the single thread send()-ing
> to Kafka.
> This is a bounded, blocking, lock-synchronized queue.
> To avoid deadlock scenarios (see SAMZA-1537), the append()-ing threads have
> a timeout of 2 seconds, after which the log message is discarded and the
> queue is drained.
> This means that during message bursts, threads calling append() may block for
> up to 2 seconds, and may get stuck in this pattern repeatedly, leading to
> processing stalls and lowered throughput.
> *Solutions for Log4j2*
> Solution 1. Enable async loggers in log4j2, since log4j2 supports and
> provides them: [https://logging.apache.org/log4j/2.x/manual/async.html].
> With this capability, the blocking queue in StreamAppender is not
> required because the logger itself is asynchronous, so append()-ing
> threads can call systemProducer.send() directly.
> However, if async loggers are not used, this queue-based mechanism is still
> required to give the append()-ing threads an "async" illusion.
> Solution 2. Continue using the blocking bounded lock-based queue, but make
> the queue size and timeout configurable. Users can then tune this to account
> for message bursts.
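
A minimal sketch of what Solution 2 could look like, assuming a hypothetical `BoundedLogBuffer` class (this is not the actual StreamAppender code). The capacity and offer timeout are constructor parameters rather than fixed values, and "draining" on timeout is modeled here as `clear()`, which is an assumption about the intended behavior:

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical illustration only; names and structure are not taken from Samza. */
final class BoundedLogBuffer {
    private final BlockingQueue<String> queue;
    private final long offerTimeoutMs;

    /** Both the queue size and the append timeout are supplied by the user. */
    BoundedLogBuffer(int capacity, long offerTimeoutMs) {
        this.queue = new ArrayBlockingQueue<>(capacity);
        this.offerTimeoutMs = offerTimeoutMs;
    }

    /**
     * Called by append()-ing threads. Blocks for at most offerTimeoutMs.
     * Returns false if the message was discarded.
     */
    boolean append(String message) {
        try {
            if (queue.offer(message, offerTimeoutMs, TimeUnit.MILLISECONDS)) {
                return true;
            }
        } catch (InterruptedException e) {
            // Preserve the interrupt status and treat the message as discarded.
            Thread.currentThread().interrupt();
            return false;
        }
        // Timed out: discard this message and empty the queue (assumed meaning
        // of "drained") so that subsequent appends do not block again.
        queue.clear();
        return false;
    }

    /** Called by the single send()-ing thread; moves all queued messages into out. */
    int drainTo(List<String> out) {
        return queue.drainTo(out);
    }

    int size() {
        return queue.size();
    }
}
```

With a larger capacity the queue absorbs longer bursts before any thread blocks, and with a shorter timeout a blocked thread gives up sooner; both trade memory or dropped log messages against processing stalls.
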
> Solution 3. Move to a lock-less queue, e.g., ConcurrentLinkedQueue
> (unbounded), implement a bounded lock-less queue, or use [open-source
> implementations|https://stackoverflow.com/questions/20890554/lock-free-circular-array].
> A lock-less queue uses CAS operations instead of locks, so append()-ing
> threads no longer need to block or time out. However, because such a queue
> is non-blocking, a caller that cannot make progress may busy-wait unless it
> retries at a fixed rate or after a fixed sleep time.
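
A rough sketch of the unbounded variant of Solution 3, assuming a hypothetical `LockFreeLogBuffer` class built on `java.util.concurrent.ConcurrentLinkedQueue` (not code from the linked PR). Appending never blocks, and the sending side sleeps for a fixed interval when the queue is empty instead of spinning. The batch size of 100 and the `sendLoop` parameters are arbitrary choices for illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.function.BooleanSupplier;
import java.util.function.Consumer;

/** Hypothetical illustration only; names and structure are not taken from Samza. */
final class LockFreeLogBuffer {
    // Unbounded, CAS-based queue: offer() never blocks and always succeeds.
    private final ConcurrentLinkedQueue<String> queue = new ConcurrentLinkedQueue<>();

    /** Called by append()-ing threads; returns immediately. */
    void append(String message) {
        queue.offer(message);
    }

    /** Removes up to maxBatch messages; returns an empty list if none are queued. */
    List<String> pollBatch(int maxBatch) {
        List<String> batch = new ArrayList<>();
        String message;
        while (batch.size() < maxBatch && (message = queue.poll()) != null) {
            batch.add(message);
        }
        return batch;
    }

    /**
     * Sender loop for the single send()-ing thread. When the queue is empty it
     * sleeps for idleSleepMs rather than busy-waiting.
     */
    void sendLoop(Consumer<String> sender, long idleSleepMs, BooleanSupplier keepRunning) {
        while (keepRunning.getAsBoolean()) {
            List<String> batch = pollBatch(100);
            if (batch.isEmpty()) {
                try {
                    Thread.sleep(idleSleepMs);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
            } else {
                batch.forEach(sender);
            }
        }
    }
}
```

Because the queue is unbounded, a sender that falls behind lets memory grow without limit; a bounded lock-less queue would instead need a policy for a full queue, such as dropping the message.
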
> *For log4j2, we will adopt Solution 1.*
> *Solutions for Log4j1*
> Solution 1. Deprecate – log4j1 is not supported.
> Solution 2. Similar to Solution 2 above.
> Solution 3. Similar to Solution 3 above.
> *For log4j1, we will adopt Solution 1 – won't fix.*
--
This message was sent by Atlassian Jira
(v8.3.4#803005)