I submitted a patch attempt in the jira. On Tue, Apr 14, 2015 at 10:16 AM, Steven Wu <stevenz...@gmail.com> wrote:
> Thanks, Ewen and Guozhang! > > I will go with the try-catch option then. here is the jira. feel free to > assign it to me. I will try to submit a patch this week. > https://issues.apache.org/jira/browse/KAFKA-2121 > > > On Mon, Apr 13, 2015 at 7:17 PM, Guozhang Wang <wangg...@gmail.com> wrote: > >> It is a valid problem and we should correct it as soon as possible, I'm >> with Ewen regarding the solution. >> >> On Mon, Apr 13, 2015 at 5:05 PM, Ewen Cheslack-Postava <e...@confluent.io >> > >> wrote: >> >> > Steven, >> > >> > Looks like there is even more that could potentially be leaked -- since >> key >> > and value serializers are created and configured at the end, even the IO >> > thread allocated by the producer could leak. Given that, I think 1 >> isn't a >> > great option since, as you said, it doesn't really address the >> underlying >> > issue. >> > >> > 3 strikes me as bad from a user experience perspective. It's true we >> might >> > want to introduce additional constructors to make testing easier, but >> the >> > more components I need to allocate myself and inject into the producer's >> > constructor, the worse the default experience is. And since you would >> have >> > to inject the dependencies to get correct, non-leaking behavior, it will >> > always be more code than previously (and a backwards incompatible >> change). >> > Additionally, the code creating a the producer would have be more >> > complicated since it would have to deal with the cleanup carefully >> whereas >> > it previously just had to deal with the exception. Besides, for testing >> > specifically, you can avoid exposing more constructors just for testing >> by >> > using something like PowerMock that let you mock private methods. That >> > requires a bit of code reorganization, but doesn't affect the public >> > interface at all. >> > >> > So my take is that a variant of 2 is probably best. I'd probably do two >> > things. First, make close() safe to call even if some fields haven't >> been >> > initialized, which presumably just means checking for null fields. (You >> > might also want to figure out if all the methods close() calls are >> > idempotent and decide whether some fields should be marked non-final and >> > cleared to null when close() is called). Second, add the try/catch as >> you >> > suggested, but just use close(). >> > >> > -Ewen >> > >> > >> > On Mon, Apr 13, 2015 at 3:53 PM, Steven Wu <stevenz...@gmail.com> >> wrote: >> > >> > > Here is the resource leak problem that we have encountered when 0.8.2 >> > java >> > > KafkaProducer failed in constructor. here is the code snippet of >> > > KafkaProducer to illustrate the problem. >> > > >> > > ------------------------------- >> > > public KafkaProducer(ProducerConfig config, Serializer<K> >> keySerializer, >> > > Serializer<V> valueSerializer) { >> > > >> > > // create metrcis reporter via reflection >> > > List<MetricsReporter> reporters = >> > > >> > > >> > >> config.getConfiguredInstances(ProducerConfig.METRIC_REPORTER_CLASSES_CONFIG, >> > > MetricsReporter.class); >> > > >> > > // validate bootstrap servers >> > > List<InetSocketAddress> addresses = >> > > >> > > >> > >> ClientUtils.parseAndValidateAddresses(config.getList(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG)); >> > > >> > > } >> > > ------------------------------- >> > > >> > > let's say MyMetricsReporter creates a thread in constructor. if >> hostname >> > > validation threw an exception, constructor won't call the close >> method of >> > > MyMetricsReporter to clean up the resource. as a result, we created >> > thread >> > > leak issue. this becomes worse when we try to auto recovery (i.e. keep >> > > creating KafkaProducer again -> failing again -> more thread leaks). >> > > >> > > there are multiple options of fixing this. >> > > >> > > 1) just move the hostname validation to the beginning. but this is >> only >> > fix >> > > one symtom. it didn't fix the fundamental problem. what if some other >> > lines >> > > throw an exception. >> > > >> > > 2) use try-catch. in the catch section, try to call close methods for >> any >> > > non-null objects constructed so far. >> > > >> > > 3) explicitly declare the dependency in the constructor. this way, >> when >> > > KafkaProducer threw an exception, I can call close method of metrics >> > > reporters for releasing resources. >> > > KafkaProducer(..., List<MetricsReporter> reporters) >> > > we don't have to dependency injection framework. but generally hiding >> > > dependency is a bad coding practice. it is also hard to plug in mocks >> for >> > > dependencies. this is probably the most intrusive change. >> > > >> > > I am willing to submit a patch. but like to hear your opinions on how >> we >> > > should fix the issue. >> > > >> > > Thanks, >> > > Steven >> > > >> > >> > >> > >> > -- >> > Thanks, >> > Ewen >> > >> >> >> >> -- >> -- Guozhang >> > >