[ https://issues.apache.org/jira/browse/HBASE-3792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Esteban Gutierrez resolved HBASE-3792.
--------------------------------------
    Resolution: Won't Fix

> TableInputFormat leaks ZK connections
> -------------------------------------
>
>                 Key: HBASE-3792
>                 URL: https://issues.apache.org/jira/browse/HBASE-3792
>             Project: HBase
>          Issue Type: Bug
>          Components: mapreduce
>    Affects Versions: 0.90.1
>         Environment: Java 1.6.0_24, Mac OS X 10.6.7
>            Reporter: Bryan Keller
>         Attachments: patch0.90.4, tableinput.patch
>
>
> The TableInputFormat creates an HTable using a new Configuration object, and
> it never cleans it up. When running a Mapper, the TableInputFormat is
> instantiated and the ZK connection is created. While this connection is not
> explicitly cleaned up, the Mapper process eventually exits and thus the
> connection is closed. Ideally the TableRecordReader would close the
> connection in its close() method rather than relying on the process to die
> for connection cleanup. This is fairly easy to implement by overriding
> TableRecordReader, and also overriding TableInputFormat to specify the new
> record reader.
> The leak occurs when the JobClient is initializing and needs to retrieve the
> splits. To get the splits, it instantiates a TableInputFormat. Doing so
> creates a ZK connection that is never cleaned up. Unlike the mapper, however,
> my job client process does not die. Thus the ZK connections accumulate.
> I was able to fix the problem by writing my own TableInputFormat that does
> not initialize the HTable in the getConf() method and does not have an HTable
> member variable. Rather, it has a variable for the table name. The HTable is
> instantiated where needed and then cleaned up. For example, in the
> getSplits() method, I create the HTable, then close the connection once the
> splits are retrieved. I also create the HTable when creating the record
> reader, and I have a record reader that closes the connection when done.
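
For readers who want to see the shape of the workaround described in the report, here is a minimal sketch of the pattern. It is not the attached patch and it does not use the HBase API: FakeTable, NameOnlyInputFormat and ScopedRecordReader are hypothetical stand-ins invented for illustration, and the region start keys are made up. The sketch only shows the idea from the report: keep the table name rather than an open table as a member, open the table where it is needed, and close it as soon as the splits are computed or the record reader is done.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for an HTable and the ZK connection behind it.
// OPEN counts the connections that have been opened but not yet closed.
final class FakeTable implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger();
    private boolean closed = false;

    FakeTable(String name) {
        OPEN.incrementAndGet(); // simulates opening a connection
    }

    // Made-up region start keys; a real table would report its own regions.
    List<String> regionStartKeys() {
        return Arrays.asList("", "m");
    }

    @Override
    public void close() {
        if (!closed) { // idempotent: a second close() does nothing
            closed = true;
            OPEN.decrementAndGet();
        }
    }
}

// Hypothetical record reader that owns its table and releases it in close().
final class ScopedRecordReader implements AutoCloseable {
    private final FakeTable table;

    ScopedRecordReader(FakeTable table) {
        this.table = table;
    }

    @Override
    public void close() {
        table.close();
    }
}

// Hypothetical input format that stores only the table name, never a table.
final class NameOnlyInputFormat {
    private final String tableName;

    NameOnlyInputFormat(String tableName) {
        this.tableName = tableName;
    }

    // Opens the table only for the duration of split computation.
    List<String> getSplits() {
        try (FakeTable table = new FakeTable(tableName)) {
            List<String> splits = new ArrayList<>();
            for (String startKey : table.regionStartKeys()) {
                splits.add(tableName + ":" + startKey);
            }
            return splits;
        } // table is closed here, even if an exception is thrown
    }

    // The reader takes ownership of a fresh table and closes it when done.
    ScopedRecordReader createRecordReader() {
        return new ScopedRecordReader(new FakeTable(tableName));
    }
}
```

With this structure, a long-lived job client can call getSplits() any number of times and the count of open connections returns to zero after each call, since nothing depends on the process exiting to release them.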