I submitted a patch attempt in the jira.

On Tue, Apr 14, 2015 at 10:16 AM, Steven Wu <stevenz...@gmail.com> wrote:

> Thanks, Ewen and Guozhang!
>
> I will go with the try-catch option then. here is the jira. feel free to
> assign it to me. I will try to submit a patch this week.
> https://issues.apache.org/jira/browse/KAFKA-2121
>
>
> On Mon, Apr 13, 2015 at 7:17 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>
>> It is a valid problem and we should correct it as soon as possible, I'm
>> with Ewen regarding the solution.
>>
>> On Mon, Apr 13, 2015 at 5:05 PM, Ewen Cheslack-Postava <e...@confluent.io
>> >
>> wrote:
>>
>> > Steven,
>> >
>> > Looks like there is even more that could potentially be leaked -- since
>> key
>> > and value serializers are created and configured at the end, even the IO
>> > thread allocated by the producer could leak. Given that, I think 1
>> isn't a
>> > great option since, as you said, it doesn't really address the
>> underlying
>> > issue.
>> >
>> > 3 strikes me as bad from a user experience perspective. It's true we
>> might
>> > want to introduce additional constructors to make testing easier, but
>> the
>> > more components I need to allocate myself and inject into the producer's
>> > constructor, the worse the default experience is. And since you would
>> have
>> > to inject the dependencies to get correct, non-leaking behavior, it will
>> > always be more code than previously (and a backwards incompatible
>> change).
>> > Additionally, the code creating a the producer would have be more
>> > complicated since it would have to deal with the cleanup carefully
>> whereas
>> > it previously just had to deal with the exception. Besides, for testing
>> > specifically, you can avoid exposing more constructors just for testing
>> by
>> > using something like PowerMock that let you mock private methods. That
>> > requires a bit of code reorganization, but doesn't affect the public
>> > interface at all.
>> >
>> > So my take is that a variant of 2 is probably best. I'd probably do two
>> > things. First, make close() safe to call even if some fields haven't
>> been
>> > initialized, which presumably just means checking for null fields. (You
>> > might also want to figure out if all the methods close() calls are
>> > idempotent and decide whether some fields should be marked non-final and
>> > cleared to null when close() is called). Second, add the try/catch as
>> you
>> > suggested, but just use close().
>> >
>> > -Ewen
>> >
>> >
>> > On Mon, Apr 13, 2015 at 3:53 PM, Steven Wu <stevenz...@gmail.com>
>> wrote:
>> >
>> > > Here is the resource leak problem that we have encountered when 0.8.2
>> > java
>> > > KafkaProducer failed in constructor. here is the code snippet of
>> > > KafkaProducer to illustrate the problem.
>> > >
>> > > -------------------------------
>> > > public KafkaProducer(ProducerConfig config, Serializer<K>
>> keySerializer,
>> > > Serializer<V> valueSerializer) {
>> > >
>> > >     // create metrcis reporter via reflection
>> > >     List<MetricsReporter> reporters =
>> > >
>> > >
>> >
>> config.getConfiguredInstances(ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG,
>> > > MetricsReporter.class);
>> > >
>> > >     // validate bootstrap servers
>> > >     List<InetSocketAddress> addresses =
>> > >
>> > >
>> >
>> ClientUtils.parseAndValidateAddresses(config.getList(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG));
>> > >
>> > > }
>> > > -------------------------------
>> > >
>> > > let's say MyMetricsReporter creates a thread in constructor. if
>> hostname
>> > > validation threw an exception, constructor won't call the close
>> method of
>> > > MyMetricsReporter to clean up the resource. as a result, we created
>> > thread
>> > > leak issue. this becomes worse when we try to auto recovery (i.e. keep
>> > > creating KafkaProducer again -> failing again -> more thread leaks).
>> > >
>> > > there are multiple options of fixing this.
>> > >
>> > > 1) just move the hostname validation to the beginning. but this is
>> only
>> > fix
>> > > one symtom. it didn't fix the fundamental problem. what if some other
>> > lines
>> > > throw an exception.
>> > >
>> > > 2) use try-catch. in the catch section, try to call close methods for
>> any
>> > > non-null objects constructed so far.
>> > >
>> > > 3) explicitly declare the dependency in the constructor. this way,
>> when
>> > > KafkaProducer threw an exception, I can call close method of metrics
>> > > reporters for releasing resources.
>> > >     KafkaProducer(..., List<MetricsReporter> reporters)
>> > > we don't have to dependency injection framework. but generally hiding
>> > > dependency is a bad coding practice. it is also hard to plug in mocks
>> for
>> > > dependencies. this is probably the most intrusive change.
>> > >
>> > > I am willing to submit a patch. but like to hear your opinions on how
>> we
>> > > should fix the issue.
>> > >
>> > > Thanks,
>> > > Steven
>> > >
>> >
>> >
>> >
>> > --
>> > Thanks,
>> > Ewen
>> >
>>
>>
>>
>> --
>> -- Guozhang
>>
>
>

Reply via email to