Hi Robert,

Thank you for confirming that there is an issue.
I do not have a solution for it yet and would like to hear the committers'
insight into what exactly is going wrong there.

I think there are actually two issues: the first is that the HBase InputFormat
does not close its connection in close(); the second is that DataSourceNode
never calls close() on the format it configures.
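
To illustrate what I mean, here is a rough sketch (the class name, the
openConnection() helper and the computeEstimates() method are made up for the
example; this is not the actual connector or optimizer code):

import java.io.IOException;

import org.apache.flink.api.common.io.GenericInputFormat;
import org.apache.flink.configuration.Configuration;

/**
 * Hypothetical format, only to illustrate the life cycle I have in mind.
 * The connection type and openConnection() helper are stand-ins, not the
 * real HBase connector code.
 */
public class ConnectionAwareInputFormat extends GenericInputFormat<String> {

    private transient AutoCloseable connection;

    @Override
    public void configure(Configuration parameters) {
        // The resource is acquired during configuration, which also runs
        // while the optimizer builds the job graph, i.e. before execution.
        this.connection = openConnection();
    }

    @Override
    public void close() throws IOException {
        // Issue 1: close() should release whatever configure() opened.
        if (connection != null) {
            try {
                connection.close();
            } catch (Exception e) {
                throw new IOException(e);
            } finally {
                connection = null;
            }
        }
    }

    @Override
    public boolean reachedEnd() {
        return true; // no records; this class only shows the resource handling
    }

    @Override
    public String nextRecord(String reuse) {
        return null;
    }

    private AutoCloseable openConnection() {
        // Stand-in for something like ConnectionFactory.createConnection(conf)
        return () -> { };
    }

    /**
     * Issue 2, caller side: the pattern I would expect where the optimizer
     * configures a format to compute estimates (sketch only, not the real
     * DataSourceNode code).
     */
    public static void computeEstimates(ConnectionAwareInputFormat format,
                                        Configuration parameters) throws IOException {
        format.configure(parameters);
        try {
            // ... use the configured format to gather statistics ...
        } finally {
            format.close(); // currently missing, hence the leaked connections
        }
    }
}

With both pieces in place, a long-running driver process like mine would no
longer accumulate connections opened during plan generation.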

Cheers,
Mark

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Thursday, August 27, 2020 3:30 PM, Robert Metzger <rmetz...@apache.org> 
wrote:

> Hi Mark,
>
> Thanks a lot for your message and the good investigation! I believe you've 
> found a bug in Flink. I filed an issue for the problem: 
> https://issues.apache.org/jira/browse/FLINK-19064.
>
> Would you be interested in opening a pull request to fix this?
> Otherwise, I'm sure a committer will pick up the issue soon.
>
> I'm not aware of a simple workaround for the problem.
>
> Best,
> Robert
>
> On Wed, Aug 26, 2020 at 4:05 PM Mark Davis <moda...@protonmail.com> wrote:
>
>> Hi,
>>
>> I am trying to investigate a problem with non-released resources in my 
>> application.
>>
>> I have a stateful application which submits Flink DataSet jobs using code
>> very similar to the code in CliFrontend.
>> I noticed that I am getting a lot of unclosed connections to my data store
>> (HBase in my case). The connections are held by the application, not by the
>> jobs themselves.
>>
>> I am using HBaseRowDataInputFormat and it seems that the HBase connections
>> opened in the configure() method during job graph creation (before the job
>> is executed) are never closed. My search led me to the method
>> DataSourceNode.computeOperatorSpecificDefaultEstimates(DataStatistics), where
>> I can see that the format is not closed after being configured.
>>
>> Is that correct? How can I overcome this issue?
>>
>> My application is long-running, which is probably why I observe the resource
>> leak. If I spawned a new JVM to run the jobs, this problem would not be
>> noticeable.
>>
>> Thank you!
>>
>> Cheers,
>> Marc
