Thank you, Ismael. I will switch to ConcurrentHashMap later today if there are no objections.
Regards, Rajini On Tue, Dec 22, 2015 at 11:57 AM, Ismael Juma <[email protected]> wrote: > Hi Rajini, > > Looking at git annotate, the choice of CopyOnWriteMap was done in the > original commit that introduced the Producer. For cases where writes are > very rare, it can be a win, but it's problematic as soon as this assumption > changes (particularly if the map is large). We switched from CopyOnWriteMap > to ConcurrentHashMap in the Metrics class, as a result (see KAFKA-2664 for > details). It sounds like we may want to change it here too. > > Best, > Ismael > > On Tue, Dec 22, 2015 at 11:41 AM, Rajini Sivaram < > [email protected]> wrote: > > > I was looking at removing unused partitions from > > org.apache.kafka.clients.producer.internals.RecordAccumulator#batches to > > avoid the map growing indefinitely, especially in the REST service. The > PR > > under https://issues.apache.org/jira/browse/KAFKA-2948 has the details. > > With CopyOnWriteMap, removing entries requires synchronization, and there > > are places in the code like dequeFor() which relies on partitions never > > being removed from the map. I wasn't sure why CopyOnWriteMap was chosen > for > > this map, instead of perhaps ConcurrentHashMap. This would avoid large > > copies of maps under a lock when there are large number of partitions, > and > > also make it easier to remove entries from the map. But I imagine there > > must have been a reason why CopyOnWriteMap was preferred choice. > > > > Thoughts? > > > > > > Regards, > > > > Rajini > > >
