Thank you, Ismael. I will switch to ConcurrentHashMap later today if there
are no objections.

Regards,

Rajini

On Tue, Dec 22, 2015 at 11:57 AM, Ismael Juma <[email protected]> wrote:

> Hi Rajini,
>
> Looking at git annotate, the choice of CopyOnWriteMap was done in the
> original commit that introduced the Producer. For cases where writes are
> very rare, it can be a win, but it's problematic as soon as this assumption
> changes (particularly if the map is large). We switched from CopyOnWriteMap
> to ConcurrentHashMap in the Metrics class, as a result (see KAFKA-2664 for
> details). It sounds like we may want to change it here too.
>
> Best,
> Ismael
>
> On Tue, Dec 22, 2015 at 11:41 AM, Rajini Sivaram <
> [email protected]> wrote:
>
> > I was looking at removing unused partitions from
> > org.apache.kafka.clients.producer.internals.RecordAccumulator#batches to
> > avoid the map growing indefinitely, especially in the REST service. The
> PR
> > under https://issues.apache.org/jira/browse/KAFKA-2948 has the details.
> > With CopyOnWriteMap, removing entries requires synchronization, and there
> > are places in the code like dequeFor() which relies on partitions never
> > being removed from the map. I wasn't sure why CopyOnWriteMap was chosen
> for
> > this map, instead of perhaps ConcurrentHashMap. This would avoid large
> > copies of maps under a lock when there are large number of partitions,
> and
> > also make it easier to remove entries from the map. But I imagine there
> > must have been a reason why CopyOnWriteMap was preferred choice.
> >
> > Thoughts?
> >
> >
> > Regards,
> >
> > Rajini
> >
>

Reply via email to