[ https://issues.apache.org/jira/browse/PHOENIX-4489?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16315655#comment-16315655 ]
Karan Mehta commented on PHOENIX-4489:
--------------------------------------

[~vincentpoon] Technically, as we discussed, it shouldn't be a problem, since the connection object goes out of scope quickly after the generateSplits() method is executed and should then be garbage collected.

However, if you check out PHOENIX-4503, the client is trying to read multiple Spark dataframes inside a loop (almost 50 times). Such code executes fast and creates lots of HConnections and ZKConnections in a short span of time. I suspect that even though GC will eventually clear them, it might take some time before that happens (until the JVM feels the need). This can cause issues with the application, and I see many issues filed in this regard.

Also, since the connections are not instantiated via the factory, it is difficult to track how many of them are open and to limit the resources with a custom implementation.

What do you think?

FYI, [~aertoria]

> HBase Connection leak in Phoenix MR Jobs
> ----------------------------------------
>
> Key: PHOENIX-4489
> URL: https://issues.apache.org/jira/browse/PHOENIX-4489
> Project: Phoenix
> Issue Type: Bug
> Reporter: Karan Mehta
> Assignee: Karan Mehta
> Attachments: PHOENIX-4489.001.patch
>
>
> Phoenix MR jobs use a custom class {{PhoenixInputFormat}} to determine the
> splits and the parallelism of the work. The class directly opens an HBase
> connection, which is not closed after use. Independently running MR
> jobs should not be affected; however, jobs that run through Phoenix-Spark
> can leak connections if this is left unclosed (since those jobs run as
> part of the same JVM).
> Apart from this, the connection should be instantiated with
> {{HBaseFactoryProvider.getHConnectionFactory()}} instead of the default one.
> This is useful if a separate client is trying to run jobs and wants to
> provide a custom implementation of {{HConnection}}.
> [~jmahonin] Any ideas?
> [~jamestaylor] [~vincentpoon] Any concerns around this?

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
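
For illustration, here is a minimal sketch of the two points raised in the description: close the connection as soon as the splits are computed rather than waiting for GC, and create it through a factory so that open connections can be counted and capped. It does not use the HBase or Phoenix APIs. SketchConnection, CountingConnectionFactory and this generateSplits() are hypothetical stand-ins for HConnection, the factory returned by HBaseFactoryProvider.getHConnectionFactory(), and the split generation in PhoenixInputFormat; none of it is taken from the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ConnectionLeakSketch {

    /** Hypothetical stand-in for an HBase connection; not the real HConnection API. */
    interface SketchConnection extends AutoCloseable {
        List<String> regionBoundaries(String table);

        @Override
        void close(); // narrowed so that closing throws no checked exception
    }

    /** Hypothetical factory that counts open connections and enforces a cap. */
    static final class CountingConnectionFactory {
        private final AtomicInteger open = new AtomicInteger();
        private final int maxOpen;

        CountingConnectionFactory(int maxOpen) {
            this.maxOpen = maxOpen;
        }

        SketchConnection create() {
            if (open.incrementAndGet() > maxOpen) {
                open.decrementAndGet();
                throw new IllegalStateException("too many open connections");
            }
            return new SketchConnection() {
                private boolean closed;

                @Override
                public List<String> regionBoundaries(String table) {
                    // Fake region boundaries; a real connection would ask the cluster.
                    List<String> boundaries = new ArrayList<>();
                    boundaries.add(table + ":a");
                    boundaries.add(table + ":m");
                    return boundaries;
                }

                @Override
                public synchronized void close() {
                    if (!closed) { // idempotent: only the first close decrements
                        closed = true;
                        open.decrementAndGet();
                    }
                }
            };
        }

        int openCount() {
            return open.get();
        }
    }

    /** Computes splits and releases the connection before returning. */
    static List<String> generateSplits(CountingConnectionFactory factory, String table) {
        try (SketchConnection conn = factory.create()) {
            return new ArrayList<>(conn.regionBoundaries(table));
        }
    }

    public static void main(String[] args) {
        CountingConnectionFactory factory = new CountingConnectionFactory(4);
        // Mimics the loop from PHOENIX-4503: many reads in quick succession.
        for (int i = 0; i < 50; i++) {
            generateSplits(factory, "T" + i);
        }
        System.out.println("open connections after loop: " + factory.openCount());
    }
}
```

With try-with-resources the count is back to zero after every call, so the loop never depends on GC timing, and the cap in the factory turns an unbounded leak into an immediate, visible failure.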