[jira] [Updated] (HADOOP-17528) Not closing an SFTP File System instance prevents JVM from exiting.
[ https://issues.apache.org/jira/browse/HADOOP-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin updated HADOOP-17528: -- Resolution: Fixed Status: Resolved (was: Patch Available) > Not closing an SFTP File System instance prevents JVM from exiting. > > > Key: HADOOP-17528 > URL: https://issues.apache.org/jira/browse/HADOOP-17528 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Mikhail Pryakhin >Assignee: Mikhail Pryakhin >Priority: Major > Labels: pull-request-available > Time Spent: 2h 50m > Remaining Estimate: 0h > > SFTP file system leverages a connection pool which is not closed when a file > system instance gets closed preventing a JVM from exiting as every SFTP > connection runs in a separate non-daemon thread. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Issue Comment Deleted] (HADOOP-17528) Not closing an SFTP File System instance prevents JVM from exiting.
[ https://issues.apache.org/jira/browse/HADOOP-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin updated HADOOP-17528: -- Comment: was deleted (was: I've created a [PR|https://github.com/apache/hadoop/pull/2701], could someone review it please?) > Not closing an SFTP File System instance prevents JVM from exiting. > > > Key: HADOOP-17528 > URL: https://issues.apache.org/jira/browse/HADOOP-17528 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Mikhail Pryakhin >Assignee: Mikhail Pryakhin >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > SFTP file system leverages a connection pool which is not closed when a file > system instance gets closed preventing a JVM from exiting as every SFTP > connection runs in a separate non-daemon thread. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17528) Not closing an SFTP File System instance prevents JVM from exiting.
[ https://issues.apache.org/jira/browse/HADOOP-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17284564#comment-17284564 ] Mikhail Pryakhin commented on HADOOP-17528: --- I've created a [PR|https://github.com/apache/hadoop/pull/2701], could someone review it please? > Not closing an SFTP File System instance prevents JVM from exiting. > > > Key: HADOOP-17528 > URL: https://issues.apache.org/jira/browse/HADOOP-17528 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Mikhail Pryakhin >Assignee: Mikhail Pryakhin >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > SFTP file system leverages a connection pool which is not closed when a file > system instance gets closed preventing a JVM from exiting as every SFTP > connection runs in a separate non-daemon thread. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17528) Not closing an SFTP File System instance prevents JVM from exiting.
[ https://issues.apache.org/jira/browse/HADOOP-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin updated HADOOP-17528: -- Affects Version/s: 3.2.0 Status: Patch Available (was: Open) https://github.com/apache/hadoop/pull/2701.patch > Not closing an SFTP File System instance prevents JVM from exiting. > > > Key: HADOOP-17528 > URL: https://issues.apache.org/jira/browse/HADOOP-17528 > Project: Hadoop Common > Issue Type: Bug >Affects Versions: 3.2.0 >Reporter: Mikhail Pryakhin >Assignee: Mikhail Pryakhin >Priority: Major > Labels: pull-request-available > Time Spent: 10m > Remaining Estimate: 0h > > SFTP file system leverages a connection pool which is not closed when a file > system instance gets closed preventing a JVM from exiting as every SFTP > connection runs in a separate non-daemon thread. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17528) Not closing an SFTP File System instance prevents JVM from exiting.
[ https://issues.apache.org/jira/browse/HADOOP-17528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin updated HADOOP-17528: -- Description: SFTP file system leverages a connection pool which is not closed when a file system instance gets closed preventing a JVM from exiting as every SFTP connection runs in a separate non-daemon thread. (was: Not closing an SFTP File System instance prevents JVM from exiting. SFTP file system leverages a connection pool which is not closed when a file system instance gets closed preventing a JVM from exiting as every SFTP connection runs in a separate non-daemon thread.) > Not closing an SFTP File System instance prevents JVM from exiting. > > > Key: HADOOP-17528 > URL: https://issues.apache.org/jira/browse/HADOOP-17528 > Project: Hadoop Common > Issue Type: Bug >Reporter: Mikhail Pryakhin >Assignee: Mikhail Pryakhin >Priority: Major > > SFTP file system leverages a connection pool which is not closed when a file > system instance gets closed preventing a JVM from exiting as every SFTP > connection runs in a separate non-daemon thread. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Created] (HADOOP-17528) Not closing an SFTP File System instance prevents JVM from exiting.
Mikhail Pryakhin created HADOOP-17528: - Summary: Not closing an SFTP File System instance prevents JVM from exiting. Key: HADOOP-17528 URL: https://issues.apache.org/jira/browse/HADOOP-17528 Project: Hadoop Common Issue Type: Bug Reporter: Mikhail Pryakhin Assignee: Mikhail Pryakhin Not closing an SFTP File System instance prevents JVM from exiting. SFTP file system leverages a connection pool which is not closed when a file system instance gets closed preventing a JVM from exiting as every SFTP connection runs in a separate non-daemon thread. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
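The behaviour described in this issue can be reproduced without Hadoop or an SFTP server. What follows is a minimal sketch under stated assumptions, not the patch from the pull request: `PooledConnection` and `ClosableFileSystem` are hypothetical stand-ins for an SFTP connection and for the file system that owns the connection pool. Each connection runs in its own non-daemon thread, so the JVM can exit only once `close()` on the file system also shuts the pool down.

```java
import java.util.ArrayList;
import java.util.List;

class PoolShutdownSketch {

  /** Hypothetical stand-in for an SFTP connection backed by its own non-daemon thread. */
  static final class PooledConnection {
    private final Thread worker;

    PooledConnection() {
      worker = new Thread(() -> {
        try {
          Thread.sleep(Long.MAX_VALUE); // simulates a session waiting for work
        } catch (InterruptedException e) {
          // interrupted by disconnect(): let the thread finish
        }
      });
      worker.setDaemon(false); // non-daemon: keeps the JVM alive while it runs
      worker.start();
    }

    void disconnect() {
      worker.interrupt();
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }

    boolean isAlive() {
      return worker.isAlive();
    }
  }

  /** Hypothetical file system that owns a pool and closes it in close(). */
  static final class ClosableFileSystem implements AutoCloseable {
    private final List<PooledConnection> pool = new ArrayList<>();

    PooledConnection connect() {
      PooledConnection c = new PooledConnection();
      pool.add(c);
      return c;
    }

    @Override
    public void close() {
      // Without this loop the worker threads would outlive the file system.
      for (PooledConnection c : pool) {
        c.disconnect();
      }
      pool.clear();
    }
  }

  public static void main(String[] args) {
    PooledConnection c;
    try (ClosableFileSystem fs = new ClosableFileSystem()) {
      c = fs.connect();
      System.out.println("alive while open: " + c.isAlive());
    }
    System.out.println("alive after close: " + c.isAlive());
  }
}
```

If the loop in `close()` is removed, `main` returns but the process keeps running, which is the symptom reported here.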
[jira] [Commented] (HADOOP-14566) Add seek support for SFTP FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118548#comment-17118548 ] Mikhail Pryakhin commented on HADOOP-14566: --- [~ste...@apache.org] thanks a lot for your review. All the issues you pointed out have been fixed. Could we let it to go in? > Add seek support for SFTP FileSystem > > > Key: HADOOP-14566 > URL: https://issues.apache.org/jira/browse/HADOOP-14566 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Reporter: Azhagu Selvan SP >Assignee: Mikhail Pryakhin >Priority: Minor > Attachments: HADOOP-14566.001.patch, HADOOP-14566.patch > > > This patch adds seek() method implementation for SFTP FileSystem and a unit > test for the same -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-14566) Add seek support for SFTP FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17118548#comment-17118548 ] Mikhail Pryakhin edited comment on HADOOP-14566 at 5/28/20, 11:06 AM: -- [~ste...@apache.org] thanks a lot for your review. All the issues you pointed out have been fixed. Could we let it go in? was (Author: m.pryahin): [~ste...@apache.org] thanks a lot for your review. All the issues you pointed out have been fixed. Could we let it to go in? > Add seek support for SFTP FileSystem > > > Key: HADOOP-14566 > URL: https://issues.apache.org/jira/browse/HADOOP-14566 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Reporter: Azhagu Selvan SP >Assignee: Mikhail Pryakhin >Priority: Minor > Attachments: HADOOP-14566.001.patch, HADOOP-14566.patch > > > This patch adds seek() method implementation for SFTP FileSystem and a unit > test for the same -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14566) Add seek support for SFTP FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17111308#comment-17111308 ] Mikhail Pryakhin commented on HADOOP-14566: --- Hey [~ste...@apache.org], I'm just writing to ask whether it could be possible to review the changes introduced by the patch as they're quite important for me. Thanks :) > Add seek support for SFTP FileSystem > > > Key: HADOOP-14566 > URL: https://issues.apache.org/jira/browse/HADOOP-14566 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Reporter: Azhagu Selvan SP >Assignee: Mikhail Pryakhin >Priority: Minor > Attachments: HADOOP-14566.001.patch, HADOOP-14566.patch > > > This patch adds seek() method implementation for SFTP FileSystem and a unit > test for the same -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17036) TestFTPFileSystem failing as ftp server dir already exists
[ https://issues.apache.org/jira/browse/HADOOP-17036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107554#comment-17107554 ] Mikhail Pryakhin commented on HADOOP-17036: --- Why did it fail to integrate changes? Can I help somehow? > TestFTPFileSystem failing as ftp server dir already exists > -- > > Key: HADOOP-17036 > URL: https://issues.apache.org/jira/browse/HADOOP-17036 > Project: Hadoop Common > Issue Type: Improvement > Components: fs, test >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Mikhail Pryakhin >Priority: Minor > Fix For: 3.4.0 > > > TestFTPFileSystem failing as the test dir exists. > need to delete in setup/teardown of each test case -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
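The fix suggested in the issue description (delete the test directory in the setup and teardown of each test case) could look roughly like the sketch below. It uses only `java.nio.file` rather than the Hadoop or JUnit APIs, and the names `deleteRecursively` and `resetTestDir` are hypothetical helpers, not code from the pull request.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class TestDirCleanupSketch {

  /** Deletes 'dir' and everything below it; does nothing if it does not exist. */
  static void deleteRecursively(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return;
    }
    List<Path> paths;
    try (Stream<Path> walk = Files.walk(dir)) {
      // Reverse order so that children are deleted before their parents.
      paths = walk.sorted(Comparator.reverseOrder()).collect(Collectors.toList());
    }
    for (Path p : paths) {
      Files.delete(p);
    }
  }

  /** What a per-test setup could do: start from an empty directory every time. */
  static Path resetTestDir(Path dir) throws IOException {
    deleteRecursively(dir);
    return Files.createDirectories(dir);
  }
}
```

Calling `resetTestDir` from the setup method and `deleteRecursively` from the teardown method means a leftover directory from an aborted run no longer breaks the next one.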
[jira] [Commented] (HADOOP-17036) TestFTPFileSystem failing as ftp server dir already exists
[ https://issues.apache.org/jira/browse/HADOOP-17036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17107263#comment-17107263 ] Mikhail Pryakhin commented on HADOOP-17036: --- Hey [~ste...@apache.org], the test runner reports success. Could you please check it out? Thank you! > TestFTPFileSystem failing as ftp server dir already exists > -- > > Key: HADOOP-17036 > URL: https://issues.apache.org/jira/browse/HADOOP-17036 > Project: Hadoop Common > Issue Type: Improvement > Components: fs, test >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Mikhail Pryakhin >Priority: Minor > > TestFTPFileSystem failing as the test dir exists. > need to delete in setup/teardown of each test case -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-17036) TestFTPFileSystem failing as ftp server dir already exists
[ https://issues.apache.org/jira/browse/HADOOP-17036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105578#comment-17105578 ]

Mikhail Pryakhin edited comment on HADOOP-17036 at 5/12/20, 5:12 PM:
---------------------------------------------------------------------

here it is: [https://github.com/apache/hadoop/pull/2009]

And now the test runner fails the build by virtue of the following failed test. I'm currently investigating the reason, as it passes locally.
{code:java}
java.lang.AssertionError: Expected exactly one metric for name RpcServerExceptionNumOps expected:<1> but was:<0>
    at org.junit.Assert.fail(Assert.java:88)
    at org.junit.Assert.failNotEquals(Assert.java:834)
    at org.junit.Assert.assertEquals(Assert.java:645)
    at org.apache.hadoop.test.MetricsAsserts.checkCaptured(MetricsAsserts.java:278)
    at org.apache.hadoop.test.MetricsAsserts.getLongCounter(MetricsAsserts.java:237)
    at org.apache.hadoop.test.MetricsAsserts.assertCounter(MetricsAsserts.java:231)
    at org.apache.hadoop.ipc.TestRPC.testCallsInternal(TestRPC.java:510)
    at org.apache.hadoop.ipc.TestRPC.testCalls(TestRPC.java:428)
{code}

was (Author: m.pryahin):
here it is: [https://github.com/apache/hadoop/pull/2009]

> TestFTPFileSystem failing as ftp server dir already exists
> ----------------------------------------------------------
>
>                 Key: HADOOP-17036
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17036
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: fs, test
>    Affects Versions: 3.4.0
>            Reporter: Steve Loughran
>            Assignee: Mikhail Pryakhin
>            Priority: Minor
>
> TestFTPFileSystem failing as the test dir exists.
> need to delete in setup/teardown of each test case
[jira] [Commented] (HADOOP-14566) Add seek support for SFTP FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105579#comment-17105579 ] Mikhail Pryakhin commented on HADOOP-14566: --- a pr is available here [https://github.com/apache/hadoop/pull/1999] > Add seek support for SFTP FileSystem > > > Key: HADOOP-14566 > URL: https://issues.apache.org/jira/browse/HADOOP-14566 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Reporter: Azhagu Selvan SP >Assignee: Mikhail Pryakhin >Priority: Minor > Attachments: HADOOP-14566.001.patch, HADOOP-14566.patch > > > This patch adds seek() method implementation for SFTP FileSystem and a unit > test for the same -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-17036) TestFTPFileSystem failing as ftp server dir already exists
[ https://issues.apache.org/jira/browse/HADOOP-17036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17105578#comment-17105578 ] Mikhail Pryakhin commented on HADOOP-17036: --- here it is: [https://github.com/apache/hadoop/pull/2009] > TestFTPFileSystem failing as ftp server dir already exists > -- > > Key: HADOOP-17036 > URL: https://issues.apache.org/jira/browse/HADOOP-17036 > Project: Hadoop Common > Issue Type: Improvement > Components: fs, test >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Mikhail Pryakhin >Priority: Minor > > TestFTPFileSystem failing as the test dir exists. > need to delete in setup/teardown of each test case -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-17036) TestFTPFileSystem failing as ftp server dir already exists
[ https://issues.apache.org/jira/browse/HADOOP-17036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin updated HADOOP-17036: -- Status: Patch Available (was: Open) patch available [https://github.com/apache/hadoop/pull/2009.patch] > TestFTPFileSystem failing as ftp server dir already exists > -- > > Key: HADOOP-17036 > URL: https://issues.apache.org/jira/browse/HADOOP-17036 > Project: Hadoop Common > Issue Type: Improvement > Components: fs, test >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Mikhail Pryakhin >Priority: Minor > > TestFTPFileSystem failing as the test dir exists. > need to delete in setup/teardown of each test case -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-17036) TestFTPFileSystem failing as ftp server dir already exists
[ https://issues.apache.org/jira/browse/HADOOP-17036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin reassigned HADOOP-17036: - Assignee: Mikhail Pryakhin > TestFTPFileSystem failing as ftp server dir already exists > -- > > Key: HADOOP-17036 > URL: https://issues.apache.org/jira/browse/HADOOP-17036 > Project: Hadoop Common > Issue Type: Improvement > Components: fs, test >Affects Versions: 3.4.0 >Reporter: Steve Loughran >Assignee: Mikhail Pryakhin >Priority: Minor > > TestFTPFileSystem failing as the test dir exists. > need to delete in setup/teardown of each test case -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14566) Add seek support for SFTP FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin updated HADOOP-14566: -- Status: Patch Available (was: In Progress) > Add seek support for SFTP FileSystem > > > Key: HADOOP-14566 > URL: https://issues.apache.org/jira/browse/HADOOP-14566 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Reporter: Azhagu Selvan SP >Assignee: Mikhail Pryakhin >Priority: Minor > Attachments: HADOOP-14566.001.patch, HADOOP-14566.patch > > > This patch adds seek() method implementation for SFTP FileSystem and a unit > test for the same -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14566) Add seek support for SFTP FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17101713#comment-17101713 ] Mikhail Pryakhin commented on HADOOP-14566: --- the patch is available [https://patch-diff.githubusercontent.com/raw/apache/hadoop/pull/1999.patch] > Add seek support for SFTP FileSystem > > > Key: HADOOP-14566 > URL: https://issues.apache.org/jira/browse/HADOOP-14566 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Reporter: Azhagu Selvan SP >Assignee: Mikhail Pryakhin >Priority: Minor > Attachments: HADOOP-14566.001.patch, HADOOP-14566.patch > > > This patch adds seek() method implementation for SFTP FileSystem and a unit > test for the same -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Updated] (HADOOP-14566) Add seek support for SFTP FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin updated HADOOP-14566: -- Status: In Progress (was: Patch Available) > Add seek support for SFTP FileSystem > > > Key: HADOOP-14566 > URL: https://issues.apache.org/jira/browse/HADOOP-14566 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Reporter: Azhagu Selvan SP >Assignee: Mikhail Pryakhin >Priority: Minor > Attachments: HADOOP-14566.001.patch, HADOOP-14566.patch > > > This patch adds seek() method implementation for SFTP FileSystem and a unit > test for the same -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-14566) Add seek support for SFTP FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin reassigned HADOOP-14566: - Assignee: Mikhail Pryakhin > Add seek support for SFTP FileSystem > > > Key: HADOOP-14566 > URL: https://issues.apache.org/jira/browse/HADOOP-14566 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Reporter: Azhagu Selvan SP >Assignee: Mikhail Pryakhin >Priority: Minor > Attachments: HADOOP-14566.001.patch, HADOOP-14566.patch > > > This patch adds seek() method implementation for SFTP FileSystem and a unit > test for the same -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14566) Add seek support for SFTP FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17100912#comment-17100912 ] Mikhail Pryakhin commented on HADOOP-14566: --- I've managed to implement both backward and forward lazy seeks as well as `AbstractContractSeekTest` for SFTP file system. > Add seek support for SFTP FileSystem > > > Key: HADOOP-14566 > URL: https://issues.apache.org/jira/browse/HADOOP-14566 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Reporter: Azhagu Selvan SP >Priority: Minor > Attachments: HADOOP-14566.001.patch, HADOOP-14566.patch > > > This patch adds seek() method implementation for SFTP FileSystem and a unit > test for the same -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Comment Edited] (HADOOP-9713) FSDataInputStream.readFully doesn't work on filesystems without seek -even when the offset==getPos
[ https://issues.apache.org/jira/browse/HADOOP-9713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098266#comment-17098266 ]

Mikhail Pryakhin edited comment on HADOOP-9713 at 5/3/20, 3:40 PM:
-------------------------------------------------------------------

Another option is to defer a seek call until the next:
{code:java}
FsDataInputstream#read(long position, byte[] buffer, int offset, int length){code}
invocation, making it lazy. Normally the subsequent reads will proceed reading from the position where the previous read finished, meaning we can avoid making seek operations in this case. We will only need to seek when the current FsDataInputstream#getPos() != requested position. In standard read scenario, this will drastically reduce the number of seeks.

was (Author: m.pryahin):
Another option is to defer a seek call until the next
{code:java}
FsDataInputstream#read(long position, byte[] buffer, int offset, int length){code}
invocation, making it lazy. Normally the subsequent reads will proceed reading from the position where the previous read finished, meaning we can avoid making seek operations in this case. We will only need to seek when the current FsDataInputstream#getPos() != requested position. In standard read scenario, this will drastically reduce the number of seeks.

> FSDataInputStream.readFully doesn't work on filesystems without seek -even when the offset==getPos
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9713
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.1.0-beta, 1.3.0, 3.0.0-alpha1
>            Reporter: Steve Loughran
>            Assignee: Mikhail Pryakhin
>            Priority: Minor
>
> {{FSDataInputStream.readFully(offset,data)}} doesn't work even if the
> offset==the current location -because it always seeks to the offset and seeks
> back. No seek => Exception.
> We could optimise {{FSDataInputStream.readFully(offset,data)}} to eliminate
> the seeks on these operations -which would have tangible benefits for those
> filesystems where seek is expensive (remote blobstores). It would also let
> you use readFully against filesystems without seeks, provided you are only
> reading from the current location.
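The lazy-seek idea from the comment can be sketched as follows. This is an illustration under assumptions, not Hadoop code: `CountingSource` is a hypothetical in-memory stand-in for a remote stream on which repositioning is expensive, and `LazySeekStream` is a hypothetical wrapper. `seek()` only records the requested position; a real seek is issued at read time, and only when the underlying position differs from the requested one.

```java
class LazySeekSketch {

  /** Hypothetical source where every real seek is assumed to be expensive. */
  static final class CountingSource {
    private final byte[] data;
    private int pos;   // real position of the underlying stream
    int seeks;         // number of real seeks performed

    CountingSource(byte[] data) {
      this.data = data;
    }

    void seek(int target) {
      pos = target;
      seeks++;
    }

    int getPos() {
      return pos;
    }

    int read(byte[] buf, int off, int len) {
      if (pos >= data.length) {
        return -1;
      }
      int n = Math.min(len, data.length - pos);
      System.arraycopy(data, pos, buf, off, n);
      pos += n;
      return n;
    }
  }

  /** Records the requested position and seeks only when a read actually needs it. */
  static final class LazySeekStream {
    private final CountingSource source;
    private int nextReadPos;

    LazySeekStream(CountingSource source) {
      this.source = source;
    }

    void seek(int target) {
      nextReadPos = target; // no I/O here
    }

    int read(byte[] buf, int off, int len) {
      if (source.getPos() != nextReadPos) {
        source.seek(nextReadPos); // the only place a real seek happens
      }
      int n = source.read(buf, off, len);
      if (n > 0) {
        nextReadPos += n;
      }
      return n;
    }
  }
}
```

With this wrapper, a caller that seeks to the position where the previous read ended, which is the common sequential pattern, triggers no real seek at all.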
[jira] [Comment Edited] (HADOOP-9713) FSDataInputStream.readFully doesn't work on filesystems without seek -even when the offset==getPos
[ https://issues.apache.org/jira/browse/HADOOP-9713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098266#comment-17098266 ] Mikhail Pryakhin edited comment on HADOOP-9713 at 5/3/20, 3:39 PM: --- Another option is to defer a seek call until the next {code:java} FsDataInputstream#read(long position, byte[] buffer, int offset, int length){code} invocation, making it lazy. Normally the subsequent reads will proceed reading from the position where the previous read finished, meaning we can avoid making seek operations in this case. We will only need to seek when the current FsDataInputstream#getPos() != requested position. In standard read scenario, this will drastically reduce the number of seeks. was (Author: m.pryahin): Another option is to defer a seek call until the next `FsDataInputstream. read(long position, byte[] buffer, int offset, int length)` invocation, making it lazy. Normally the subsequent reads will proceed reading from the position where the previous read finished, meaning we can avoid making seek operations in this case. We will only need to seek when the current `FsDataInputstream.getPos() != requested position`. In standard read scenario, this will drastically reduce the number of seeks. > FSDataInputStream.readFully doesn't work on filesystems without seek -even > when the offset==getPos > -- > > Key: HADOOP-9713 > URL: https://issues.apache.org/jira/browse/HADOOP-9713 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.1.0-beta, 1.3.0, 3.0.0-alpha1 >Reporter: Steve Loughran >Assignee: Mikhail Pryakhin >Priority: Minor > > {{FSDataInputStream.readFully(offset,data)}} doesn't work even if the > offset==the current location -because it always seeks to the offset and seeks > back. No seek => Exception. 
> We could optimise {{FSDataInputStream.readFully(offset,data)}} to eliminate > the seeks on these operations -which would have tangible benefits for those > filesystems where seek is expensive (remote blobstores). It would also let > you use readFully against filesystems without seeks, provided you are only > reading from the current location. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Assigned] (HADOOP-9713) FSDataInputStream.readFully doesn't work on filesystems without seek -even when the offset==getPos
[ https://issues.apache.org/jira/browse/HADOOP-9713?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin reassigned HADOOP-9713: Assignee: Mikhail Pryakhin > FSDataInputStream.readFully doesn't work on filesystems without seek -even > when the offset==getPos > -- > > Key: HADOOP-9713 > URL: https://issues.apache.org/jira/browse/HADOOP-9713 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.1.0-beta, 1.3.0, 3.0.0-alpha1 >Reporter: Steve Loughran >Assignee: Mikhail Pryakhin >Priority: Minor > > {{FSDataInputStream.readFully(offset,data)}} doesn't work even if the > offset==the current location -because it always seeks to the offset and seeks > back. No seek => Exception. > We could optimise {{FSDataInputStream.readFully(offset,data)}} to eliminate > the seeks on these operations -which would have tangible benefits for those > filesystems where seek is expensive (remote blobstores). It would also let > you use readFully against filesystems without seeks, provided you are only > reading from the current location. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-9713) FSDataInputStream.readFully doesn't work on filesystems without seek -even when the offset==getPos
[ https://issues.apache.org/jira/browse/HADOOP-9713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098266#comment-17098266 ] Mikhail Pryakhin commented on HADOOP-9713: -- Another option is to defer a seek call until the next `FsDataInputstream. read(long position, byte[] buffer, int offset, int length)` invocation, making it lazy. Normally the subsequent reads will proceed reading from the position where the previous read finished, meaning we can avoid making seek operations in this case. We will only need to seek when the current `FsDataInputstream.getPos() != requested position`. In standard read scenario, this will drastically reduce the number of seeks. > FSDataInputStream.readFully doesn't work on filesystems without seek -even > when the offset==getPos > -- > > Key: HADOOP-9713 > URL: https://issues.apache.org/jira/browse/HADOOP-9713 > Project: Hadoop Common > Issue Type: Bug > Components: fs >Affects Versions: 2.1.0-beta, 1.3.0, 3.0.0-alpha1 >Reporter: Steve Loughran >Priority: Minor > > {{FSDataInputStream.readFully(offset,data)}} doesn't work even if the > offset==the current location -because it always seeks to the offset and seeks > back. No seek => Exception. > We could optimise {{FSDataInputStream.readFully(offset,data)}} to eliminate > the seeks on these operations -which would have tangible benefits for those > filesystems where seek is expensive (remote blobstores). It would also let > you use readFully against filesystems without seeks, provided you are only > reading from the current location. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-14566) Add seek support for SFTP FileSystem
[ https://issues.apache.org/jira/browse/HADOOP-14566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098041#comment-17098041 ] Mikhail Pryakhin commented on HADOOP-14566: --- Hey [~ste...@apache.org] could I take over this issue? > Add seek support for SFTP FileSystem > > > Key: HADOOP-14566 > URL: https://issues.apache.org/jira/browse/HADOOP-14566 > Project: Hadoop Common > Issue Type: Improvement > Components: fs >Reporter: Azhagu Selvan SP >Priority: Minor > Attachments: HADOOP-14566.001.patch, HADOOP-14566.patch > > > This patch adds seek() method implementation for SFTP FileSystem and a unit > test for the same -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: common-issues-h...@hadoop.apache.org
[jira] [Commented] (HADOOP-9713) FSDataInputStream.readFully doesn't work on filesystems without seek -even when the offset==getPos
[ https://issues.apache.org/jira/browse/HADOOP-9713?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17098039#comment-17098039 ]

Mikhail Pryakhin commented on HADOOP-9713:
------------------------------------------

[~ste...@apache.org] That's a great idea, but [the method JavaDoc claims|https://github.com/apache/hadoop/blob/ba66f3b454a5f6ea84f2cf7ac0082c555e2954a7/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/PositionedReadable.java#L59] that the file offset is not changed after the method invocation. This means we have to seek back to the initial position to leave the file offset unchanged, don't we?

> FSDataInputStream.readFully doesn't work on filesystems without seek -even when the offset==getPos
> --------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-9713
>                 URL: https://issues.apache.org/jira/browse/HADOOP-9713
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 2.1.0-beta, 1.3.0, 3.0.0-alpha1
>            Reporter: Steve Loughran
>            Priority: Minor
>
> {{FSDataInputStream.readFully(offset,data)}} doesn't work even if the offset==the current location -because it always seeks to the offset and seeks back. No seek => Exception.
> We could optimise {{FSDataInputStream.readFully(offset,data)}} to eliminate the seeks on these operations -which would have tangible benefits for those filesystems where seek is expensive (remote blobstores). It would also let you use readFully against filesystems without seeks, provided you are only reading from the current location.
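The constraint raised in this comment (a positioned read must leave the file offset unchanged) might look like the following in a minimal sketch. `SeekableBytes` is a hypothetical in-memory class invented for illustration, not a Hadoop type; it simply saves the offset, seeks, reads, and seeks back.

```java
/** Hypothetical in-memory seekable stream; not a Hadoop class. */
final class SeekableBytes {
  private final byte[] data;
  private long pos;

  SeekableBytes(byte[] data) {
    this.data = data;
  }

  void seek(long target) {
    if (target < 0 || target > data.length) {
      throw new IllegalArgumentException("bad offset " + target);
    }
    pos = target;
  }

  long getPos() {
    return pos;
  }

  /** Sequential read from the current offset; returns -1 at EOF. */
  int read(byte[] buf, int off, int len) {
    if (pos >= data.length) {
      return -1;
    }
    int n = (int) Math.min(len, data.length - pos);
    System.arraycopy(data, (int) pos, buf, off, n);
    pos += n;
    return n;
  }

  /** Positioned read: reads at 'position' and restores the previous offset. */
  int read(long position, byte[] buf, int off, int len) {
    long saved = pos;
    try {
      seek(position);
      return read(buf, off, len);
    } finally {
      seek(saved); // seek back so getPos() is unchanged for the caller
    }
  }
}
```

Each positioned read in this sketch costs two seeks, which is the overhead the issue wants to avoid on stores where seeking is expensive.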
[jira] [Updated] (HADOOP-15358) SFTPConnectionPool connections leakage
[ https://issues.apache.org/jira/browse/HADOOP-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Pryakhin updated HADOOP-15358:
--------------------------------------
    Release Note: Fixed SFTPConnectionPool connections leakage
      Attachment: HADOOP-15358.001.patch
          Status: Patch Available  (was: Open)

Fixed SFTPConnectionPool connections leakage

> SFTPConnectionPool connections leakage
> --------------------------------------
>
>                 Key: HADOOP-15358
>                 URL: https://issues.apache.org/jira/browse/HADOOP-15358
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: fs
>    Affects Versions: 3.0.0
>            Reporter: Mikhail Pryakhin
>            Assignee: Mikhail Pryakhin
>            Priority: Critical
>         Attachments: HADOOP-15358.001.patch
>
> Methods of SFTPFileSystem operate on poolable ChannelSftp instances. Some of these methods are chained together, so a single compound action ends up establishing multiple connections to the SFTP server. The affected methods are listed below:
> # mkdirs method
> The public mkdirs method acquires a new ChannelSftp from the pool [1] and then recursively creates directories. Before creating each directory it checks whether the directory exists by calling the exists method [2], which delegates to the getFileStatus(ChannelSftp channel, Path file) method [3], and so on until a FileStatus instance is returned [4]. The resource leakage occurs in the getWorkingDirectory method, which calls the getHomeDirectory method [5], which in turn establishes a new connection to the SFTP server instead of using the already created connection. As the mkdirs method is recursive, this results in creating a huge number of connections.
> # open method [6]
> This method returns an FSDataInputStream that wraps an SFTPInputStream instance. The SFTPInputStream does not return the acquired ChannelSftp instance to the pool but closes it instead [7]. This leads to establishing another connection to the SFTP server when the next method is called on the FileSystem instance.
>
> [1] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658
> [2] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321
> [3] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202
> [4] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290
> [5] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640
> [6] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504
> [7] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123
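The mkdirs case described above can be modelled with a small sketch. Everything here is hypothetical (`FakeChannel`, `FakePool`, `MkdirsSketch` are made-up names and the pool is a toy, not SFTPConnectionPool); it only contrasts a helper that acquires its own channel with one that reuses the channel the caller already holds.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a pooled SFTP channel. */
final class FakeChannel {
  String home() {
    return "/home/user";
  }
}

/** Hypothetical pool that counts how many connections it had to open. */
final class FakePool {
  private final Deque<FakeChannel> idle = new ArrayDeque<>();
  int opened;

  FakeChannel acquire() {
    FakeChannel c = idle.poll();
    if (c == null) {
      c = new FakeChannel(); // nothing idle: open a new connection
      opened++;
    }
    return c;
  }

  void release(FakeChannel c) {
    idle.push(c);
  }
}

final class MkdirsSketch {
  private final FakePool pool;

  MkdirsSketch(FakePool pool) {
    this.pool = pool;
  }

  /** Pattern from the issue: the helper acquires its own channel. */
  private String homeWithOwnChannel() {
    FakeChannel c = pool.acquire();
    try {
      return c.home();
    } finally {
      pool.release(c);
    }
  }

  /** Alternative: reuse the channel the caller already holds. */
  private String home(FakeChannel held) {
    return held.home();
  }

  void mkdirsNested(int depth) {
    FakeChannel c = pool.acquire();
    try {
      for (int i = 0; i < depth; i++) {
        homeWithOwnChannel(); // caller's channel is checked out, so a second one is needed
      }
    } finally {
      pool.release(c);
    }
  }

  void mkdirsReusing(int depth) {
    FakeChannel c = pool.acquire();
    try {
      for (int i = 0; i < depth; i++) {
        home(c);
      }
    } finally {
      pool.release(c);
    }
  }
}
```

In this simplified model the nested lookup costs one extra connection. How many the real code opens depends on pool and recursion behaviour that the sketch does not model.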
[jira] [Updated] (HADOOP-15358) SFTPConnectionPool connections leakage
[ https://issues.apache.org/jira/browse/HADOOP-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Mikhail Pryakhin updated HADOOP-15358:
--------------------------------------
    Description: 
Methods of SFTPFileSystem operate on poolable ChannelSftp instances. Some of these methods are chained together, so a single compound action ends up establishing multiple connections to the SFTP server. The affected methods are listed below:
 # mkdirs method
 The public mkdirs method acquires a new ChannelSftp from the pool [1] and then recursively creates directories. Before creating each directory it checks whether the directory exists by calling the exists method [2], which delegates to the getFileStatus(ChannelSftp channel, Path file) method [3], and so on until a FileStatus instance is returned [4]. The resource leakage occurs in the getWorkingDirectory method, which calls the getHomeDirectory method [5], which in turn establishes a new connection to the SFTP server instead of using the already created connection. As the mkdirs method is recursive, this results in creating a huge number of connections.
 # open method [6]
 This method returns an FSDataInputStream that wraps an SFTPInputStream instance. The SFTPInputStream does not return the acquired ChannelSftp instance to the pool but closes it instead [7]. This leads to establishing another connection to the SFTP server when the next method is called on the FileSystem instance.

[1] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658 [2] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321 [3] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202 [4] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290 [5] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640 [6] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504 [7] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123 was: Methods of SFTPFileSystem operate on poolable ChannelSftp instances, thus some methods of SFTPFileSystem are chained together resulting in establishing multiple connections to the SFTP server to accomplish one compound action, those methods are listed below: # mkdirs method the public mkdirs method acquires a new ChannelSftp [from the pool|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658]] and then recursively creates directories, checking for the directory existence beforehand by calling the method 
[exists|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321] ] which delegates to the getFileStatus(ChannelSftp channel, Path file) [method|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202]] and so on until it ends up in returning the [FilesStatus instance|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290]]. The resource leakage occurs in the method getWorkingDirectory which calls the getHomeDirectory [method|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640]] which in turn establishes a new connection to the sftp server instead of using an already created connection. As the mkdirs method is recursive this results in creating a huge number of connections. # open [method|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504]]. This method returns an instance of FSDataInputStream which consumes SFTPInputStream instance which doesn't return an acquired ChannelSftp instanc
[jira] [Updated] (HADOOP-15358) SFTPConnectionPool connections leakage
[ https://issues.apache.org/jira/browse/HADOOP-15358?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mikhail Pryakhin updated HADOOP-15358: -- Description: Methods of SFTPFileSystem operate on poolable ChannelSftp instances, thus some methods of SFTPFileSystem are chained together resulting in establishing multiple connections to the SFTP server to accomplish one compound action, those methods are listed below: # mkdirs method the public mkdirs method acquires a new ChannelSftp [from the pool|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658]] and then recursively creates directories, checking for the directory existence beforehand by calling the method [exists|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321] ] which delegates to the getFileStatus(ChannelSftp channel, Path file) [method|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202]] and so on until it ends up in returning the [FilesStatus instance|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290]]. The resource leakage occurs in the method getWorkingDirectory which calls the getHomeDirectory [method|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640]] which in turn establishes a new connection to the sftp server instead of using an already created connection. As the mkdirs method is recursive this results in creating a huge number of connections. 
# open [method|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504]]. This method returns an instance of FSDataInputStream which consumes SFTPInputStream instance which doesn't return an acquired ChannelSftp instance back to the pool but instead it [closes|[https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123]] it. This leads to establishing another connection to an SFTP server when the next method is called on the FileSystem instance. was: Methods of SFTPFileSystem operate on poolable ChannelSftp instances, thus some methods of SFTPFileSystem are chained together resulting in establishing multiple connections to the SFTP server to accomplish one compound action, those methods are listed below: # mkdirs method the public mkdirs method acquires a new ChannelSftp from the pool [1] and then recursively creates directories, checking for the directory existence beforehand by calling the method exists[2] which delegates to the getFileStatus(ChannelSftp channel, Path file) method [3] and so on until it ends up in returning the FilesStatus instance [4]. The resource leakage occurs in the method getWorkingDirectory which calls the getHomeDirectory method [5] which in turn establishes a new connection to the sftp server instead of using an already created connection. As the mkdirs method is recursive this results in creating a huge number of connections. # open method [6] This method returns an instance of FSDataInputStream which consumes SFTPInputStream instance which doesn't return an acquired ChannelSftp instance back to the pool but instead it closes it[7]. This leads to establishing another connection to an SFTP server when the next method is called on the FileSystem instance. 
[1] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658 [2] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321 [3] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202 [4] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290 [5] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640 [6] https:
[jira] [Created] (HADOOP-15358) SFTPConnectionPool connections leakage
Mikhail Pryakhin created HADOOP-15358:
-----------------------------------------

             Summary: SFTPConnectionPool connections leakage
                 Key: HADOOP-15358
                 URL: https://issues.apache.org/jira/browse/HADOOP-15358
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs
    Affects Versions: 3.0.0
            Reporter: Mikhail Pryakhin

Methods of SFTPFileSystem operate on poolable ChannelSftp instances. Some of these methods are chained together, so a single compound action ends up establishing multiple connections to the SFTP server. The affected methods are listed below:
 # mkdirs method
 The public mkdirs method acquires a new ChannelSftp from the pool [1] and then recursively creates directories. Before creating each directory it checks whether the directory exists by calling the exists method [2], which delegates to the getFileStatus(ChannelSftp channel, Path file) method [3], and so on until a FileStatus instance is returned [4]. The resource leakage occurs in the getWorkingDirectory method, which calls the getHomeDirectory method [5], which in turn establishes a new connection to the SFTP server instead of using the already created connection. As the mkdirs method is recursive, this results in creating a huge number of connections.
 # open method [6]
 This method returns an FSDataInputStream that wraps an SFTPInputStream instance. The SFTPInputStream does not return the acquired ChannelSftp instance to the pool but closes it instead [7]. This leads to establishing another connection to the SFTP server when the next method is called on the FileSystem instance.

[1] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L658
[2] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L321
[3] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L202
[4] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L290
[5] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L640
[6] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPFileSystem.java#L504
[7] https://github.com/apache/hadoop/blob/736ceab2f58fb9ab5907c5b5110bd44384038e6b/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/sftp/SFTPInputStream.java#L123
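For the open method case, a rough sketch of the difference between a stream that closes its channel and one that hands it back to the pool is shown below. The names `PooledChannel`, `ChannelPool` and `SketchStream` are hypothetical and the pool is a toy model, so this should be read as an illustration of the idea rather than the actual fix.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical stand-in for a pooled SFTP channel. */
final class PooledChannel {
  boolean connected = true;

  void disconnect() {
    connected = false;
  }
}

/** Hypothetical pool that counts how many connections it had to open. */
final class ChannelPool {
  private final Deque<PooledChannel> idle = new ArrayDeque<>();
  int opened;

  PooledChannel acquire() {
    PooledChannel c = idle.poll();
    if (c == null) {
      c = new PooledChannel(); // nothing idle: open a new connection
      opened++;
    }
    return c;
  }

  void release(PooledChannel c) {
    if (c.connected) {
      idle.push(c);
    }
  }
}

/** Assumed stream wrapper; the flag selects what close() does with the channel. */
final class SketchStream implements AutoCloseable {
  private final ChannelPool pool;
  private final PooledChannel channel;
  private final boolean returnToPool;
  private boolean closed;

  SketchStream(ChannelPool pool, boolean returnToPool) {
    this.pool = pool;
    this.channel = pool.acquire();
    this.returnToPool = returnToPool;
  }

  @Override
  public void close() {
    if (closed) {
      return; // make close() idempotent
    }
    closed = true;
    if (returnToPool) {
      pool.release(channel); // channel can be reused by the next call
    } else {
      channel.disconnect(); // pattern from the issue: the channel is lost
    }
  }
}
```

Opening and closing two streams one after the other opens two connections when the channel is closed, and only one when it is returned to the pool.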