[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-06-06 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13044793#comment-13044793
 ] 

Hudson commented on HDFS-1965:
--

Integrated in Hadoop-Hdfs-22-branch #61 (See 
[https://builds.apache.org/hudson/job/Hadoop-Hdfs-22-branch/61/])


> IPCs done using block token-based tickets can't reuse connections
> -
>
> Key: HDFS-1965
> URL: https://issues.apache.org/jira/browse/HDFS-1965
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.22.0
>
> Attachments: hdfs-1965-0.22.txt, hdfs-1965.txt, hdfs-1965.txt, 
> hdfs-1965.txt
>
>
> This is the reason that TestFileConcurrentReaders has been failing a lot. 
> Reproducing a comment from HDFS-1057:
> The test has a thread which continually re-opens the file which is being 
> written to. Since the file's in the middle of being written, it makes an RPC 
> to the DataNode in order to determine the visible length of the file. This 
> RPC is authenticated using the block token which came back in the 
> LocatedBlocks object as the security ticket.
> When this RPC hits the IPC layer, it looks at its existing connections and 
> sees none that can be re-used, since the block token differs between the two 
> requesters. Hence, it reconnects, and we end up with hundreds or thousands of 
> IPC connections to the datanode.
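The connection-cache behavior described in the quoted report can be sketched as follows. This is a hypothetical, self-contained illustration (class and method names are invented, not Hadoop's actual IPC code): when the security ticket is part of the connection key, two RPCs to the same DataNode with different block tokens can never share a connection.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Minimal sketch of a connection cache keyed by (remote address, ticket).
public class ConnectionCacheSketch {
    static final class ConnectionId {
        final String address;  // remote DataNode address
        final String ticket;   // e.g. the block token used to authenticate

        ConnectionId(String address, String ticket) {
            this.address = address;
            this.ticket = ticket;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof ConnectionId)) return false;
            ConnectionId other = (ConnectionId) o;
            return address.equals(other.address) && ticket.equals(other.ticket);
        }

        @Override public int hashCode() {
            return Objects.hash(address, ticket);
        }
    }

    static final Map<ConnectionId, String> connections = new HashMap<>();

    // Returns the cached connection for this (address, ticket) pair,
    // or opens a new one if none matches.
    static String getConnection(String address, String ticket) {
        return connections.computeIfAbsent(
            new ConnectionId(address, ticket),
            id -> "conn-" + connections.size());
    }

    public static void main(String[] args) {
        // Same DataNode, same token: the connection is reused.
        String a = getConnection("dn1:50020", "blockToken-A");
        String b = getConnection("dn1:50020", "blockToken-A");
        // Same DataNode, different block token: a fresh connection is opened.
        // This is how a reader that keeps re-opening the file piles up
        // hundreds of connections to one DataNode.
        String c = getConnection("dn1:50020", "blockToken-B");
        System.out.println(a.equals(b));  // true
        System.out.println(a.equals(c));  // false
    }
}
```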

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-23 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038076#comment-13038076
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1965:
--

Okay, I'm fine with it since it is only a temporary fix.

+1 the 0.22 patch looks good.



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-23 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038072#comment-13038072
 ] 

Todd Lipcon commented on HDFS-1965:
---

I think in trunk, it's not possible, since the connection is only lazily opened 
by the actual RPC to the DataNode. Then, it won't close since there's a call 
outstanding.

In 0.22, it's possible that it will open one connection for the 
getProtocolVersion() call and a second one for the actual RPC. Unless I'm 
missing something, that should only be an efficiency issue and not a 
correctness issue. Do you agree?



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-23 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13038055#comment-13038055
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1965:
--

A question came up: by setting maxidletime to 0, is there a race condition where 
the timeout occurs before the first call, i.e. the proxy is closed before the 
first call?



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-22 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037684#comment-13037684
 ] 

Todd Lipcon commented on HDFS-1965:
---

Nicholas: can you please take a quick look at the 0.22 patch?



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-21 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037355#comment-13037355
 ] 

Hudson commented on HDFS-1965:
--

Integrated in Hadoop-Hdfs-trunk #673 (See 
[https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk/673/])
HDFS-1965. IPCs done using block token-based tickets can't reuse 
connections. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1125605
Files : 
* /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/DFSTestUtil.java
* 
/hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java
* /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/DFSClient.java
* /hadoop/hdfs/trunk/CHANGES.txt




[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037215#comment-13037215
 ] 

Hadoop QA commented on HDFS-1965:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479975/hdfs-1965-0.22.txt
  against trunk revision 1125605.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/607//console

This message is automatically generated.



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-20 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037214#comment-13037214
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1965:
--

Hey Todd, please wait for Hadoop QA before committing the patch.  It sometimes 
catches unexpected problems.



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037209#comment-13037209
 ] 

Hadoop QA commented on HDFS-1965:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479967/hdfs-1965.txt
  against trunk revision 1125217.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 6 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestHDFSTrash

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/605//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/605//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/605//console

This message is automatically generated.



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037205#comment-13037205
 ] 

Hudson commented on HDFS-1965:
--

Integrated in Hadoop-Hdfs-trunk-Commit #677 (See 
[https://builds.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/677/])
HDFS-1965. IPCs done using block token-based tickets can't reuse 
connections. Contributed by Todd Lipcon.

todd : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1125605
Files : 
* /hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/DFSTestUtil.java
* 
/hadoop/hdfs/trunk/src/test/hdfs/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java
* /hadoop/hdfs/trunk/src/java/org/apache/hadoop/hdfs/DFSClient.java
* /hadoop/hdfs/trunk/CHANGES.txt




[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-20 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037195#comment-13037195
 ] 

Todd Lipcon commented on HDFS-1965:
---

Committed to trunk after re-running the test.

It doesn't apply directly to 0.22. Let me format a patch there and upload it 
soon.



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-20 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037113#comment-13037113
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1965:
--

Okay, you mean this is a temporary fix.  Sounds good.  Some comments on the 
patch:

- Instead of changing it to public, we could add a utility method, say in 
{{DFSTestUtil}}, for invoking the package-private method.
{code}
+  /** Public only for tests */
+  public static ClientDatanodeProtocol createClientDatanodeProtocolProxy(
{code}

- How about making {{confWithNoIpcIdle}} a member field?

- Please use 
{{CommonConfigurationKeysPublic.IPC_CLIENT_CONNECTION_MAXIDLETIME_KEY}} for 
"ipc.client.connection.maxidletime".

- Please add a comment saying that this is a temporary fix and that the 
corresponding code should be removed once {{stopProxy(..)}} is fixed.
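The {{confWithNoIpcIdle}} idea reviewed above can be sketched in a self-contained way. A plain Map stands in for Hadoop's Configuration class here, so the names below are illustrative only: the point is to copy the client conf and set ipc.client.connection.maxidletime to 0, so each block-token proxy's connection is torn down as soon as it goes idle instead of lingering for the default 10 seconds.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for building a DFSClient conf with no IPC idle time.
public class NoIdleConfSketch {
    public static Map<String, String> withNoIpcIdle(Map<String, String> conf) {
        // Copy rather than mutate, so other RPC clients keep the default.
        Map<String, String> copy = new HashMap<>(conf);
        copy.put("ipc.client.connection.maxidletime", "0");
        return copy;
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put("ipc.client.connection.maxidletime", "10000"); // default: 10 s
        Map<String, String> patched = withNoIpcIdle(conf);
        System.out.println(patched.get("ipc.client.connection.maxidletime")); // 0
        System.out.println(conf.get("ipc.client.connection.maxidletime"));    // 10000
    }
}
```

Using the config key constant ({{CommonConfigurationKeysPublic.IPC_CLIENT_CONNECTION_MAXIDLETIME_KEY}}) rather than the literal string, as suggested above, avoids silent breakage if the key ever changes.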



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-20 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037104#comment-13037104
 ] 

Todd Lipcon commented on HDFS-1965:
---

bq. Todd, just saw your comments. I think this is the real bug: we should fix 
stopProxy(..) instead of changing max idle time.

Yes, you're probably right. But maybe we can use this as a stop-gap for 0.22 
while we work on the stopProxy fix in trunk? I'm afraid the stopProxy stuff 
will be complicated - that IPC code is kind of spaghetti.




[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-20 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037095#comment-13037095
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1965:
--

> Turns out the reason that RPC.stopProxy isn't effective in "real life" is 
> that the WritableRpcEngine "Client" objects are cached in ClientCache with 
> keys that aren't tied to principals. So, stopProxy doesn't actually cause the 
> connection to disconnect. I'm not sure if that's a bug or by design.

Todd, just saw your comments.  I think this is the real bug: we should fix 
{{stopProxy(..)}} instead of changing max idle time.
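The quoted observation about ClientCache can be sketched as follows. This is a hypothetical, simplified model (the field and method names are invented): the RPC engine hands every proxy a shared Client object from a cache whose key ignores the principal, so stopping one proxy only drops a reference count while the underlying connection stays open.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a reference-counted, shared RPC client.
public class ClientCacheSketch {
    static final class Client {
        int refCount = 0;
        boolean connectionOpen = true;
    }

    // Cache key ignores the principal/token, mirroring the described behavior.
    static final Map<String, Client> cache = new HashMap<>();

    static Client getClient(String socketFactoryKey) {
        Client c = cache.computeIfAbsent(socketFactoryKey, k -> new Client());
        c.refCount++;
        return c;
    }

    static void stopClient(Client c) {
        // Only releasing the last reference actually closes the connection.
        if (--c.refCount == 0) {
            c.connectionOpen = false;
        }
    }

    public static void main(String[] args) {
        Client a = getClient("default");
        Client b = getClient("default"); // same cached Client instance
        stopClient(a);
        System.out.println(b.connectionOpen); // true: a's stop was a no-op
        stopClient(b);
        System.out.println(b.connectionOpen); // false: last ref released it
    }
}
```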



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-20 Thread Tsz Wo (Nicholas), SZE (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13037090#comment-13037090
 ] 

Tsz Wo (Nicholas), SZE commented on HDFS-1965:
--

It seems that the reasons {{TestFileConcurrentReader}} keeps failing are:
- The test opens many files within a short period of time, say a few seconds.
- {{DFSClient}} creates a proxy for each open.
- Since the default ipc.client.connection.maxidletime is 10 seconds, the 
proxies are not yet closed.
- Therefore, {{TestFileConcurrentReader}} fails with runtime exceptions (out of 
descriptors?).

Todd, do you agree?

*Questions*: We already have {{RPC.stopProxy(cdp)}} in a finally-block.  Why is 
the resource still not released?  Is it because {{TestFileConcurrentReader}} 
opens files so fast that the finally-block is not yet reached?  Or does 
{{RPC.stopProxy(..)}} not work?



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036704#comment-13036704
 ] 

Hadoop QA commented on HDFS-1965:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479878/hdfs-1965.txt
  against trunk revision 1125217.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestHDFSTrash

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/599//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/599//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/599//console

This message is automatically generated.

> IPCs done using block token-based tickets can't reuse connections
> -
>
> Key: HDFS-1965
> URL: https://issues.apache.org/jira/browse/HDFS-1965
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: security
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.22.0
>
> Attachments: hdfs-1965.txt, hdfs-1965.txt
>
>
> This is the reason that TestFileConcurrentReaders has been failing a lot. 
> Reproducing a comment from HDFS-1057:
> The test has a thread which continually re-opens the file which is being 
> written to. Since the file's in the middle of being written, it makes an RPC 
> to the DataNode in order to determine the visible length of the file. This 
> RPC is authenticated using the block token which came back in the 
> LocatedBlocks object as the security ticket.
> When this RPC hits the IPC layer, it looks at its existing connections and 
> sees none that can be re-used, since the block token differs between the two 
> requesters. Hence, it reconnects, and we end up with hundreds or thousands of 
> IPC connections to the datanode.
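The failure mode quoted above can be illustrated with a minimal model of a connection cache keyed the way the description says the IPC layer keys it: by remote address plus the per-connection security ticket. All class and token names below are illustrative, not Hadoop's actual IPC internals.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class ConnectionCacheDemo {
    // Key mirrors the described IPC keying: remote address + security ticket.
    static final class ConnectionId {
        final String address;
        final String ticket; // e.g. a block token
        ConnectionId(String address, String ticket) {
            this.address = address;
            this.ticket = ticket;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof ConnectionId)) return false;
            ConnectionId other = (ConnectionId) o;
            return address.equals(other.address) && ticket.equals(other.ticket);
        }
        @Override public int hashCode() { return Objects.hash(address, ticket); }
    }

    static final Map<ConnectionId, Object> connections = new HashMap<>();

    // A cached connection is reused only when both address AND ticket match.
    static Object getConnection(String address, String ticket) {
        return connections.computeIfAbsent(
            new ConnectionId(address, ticket), id -> new Object());
    }

    public static void main(String[] args) {
        // Each re-open carries a differing block token, so no existing
        // connection ever matches and a new one is created every time.
        for (int i = 0; i < 1000; i++) {
            getConnection("datanode:50020", "block-token-" + i);
        }
        System.out.println(connections.size()); // prints 1000
    }
}
```

With a single shared ticket the loop would reuse one cached connection; with a distinct token per request the cache grows without bound, which is the "hundreds or thousands of IPC connections" described above.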



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-19 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036614#comment-13036614
 ] 

Hadoop QA commented on HDFS-1965:
-

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12479859/hdfs-1965.txt
  against trunk revision 1125145.

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

+1 javadoc.  The javadoc tool did not generate any warning messages.

-1 javac.  The applied patch generated 29 javac compiler warnings (more 
than the trunk's current 28 warnings).

+1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

-1 core tests.  The patch failed these core unit tests:
  org.apache.hadoop.hdfs.TestDFSStorageStateRecovery
  org.apache.hadoop.hdfs.TestHDFSTrash

+1 contrib tests.  The patch passed contrib unit tests.

+1 system test framework.  The patch passed system test framework compile.

Test results: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/596//testReport/
Findbugs warnings: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/596//artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Console output: 
https://builds.apache.org/hudson/job/PreCommit-HDFS-Build/596//console

This message is automatically generated.



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-19 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036554#comment-13036554
 ] 

Todd Lipcon commented on HDFS-1965:
---

I implemented option (b) and have a test case showing that it fixes the 
problem...

BUT: the real DFSInputStream code seems to call RPC.stopProxy() after it uses 
the proxy, which should also avoid this issue. Calling it in my test case makes 
the test pass without any other fix, so there's still some mystery here.



[jira] [Commented] (HDFS-1965) IPCs done using block token-based tickets can't reuse connections

2011-05-19 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-1965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13036531#comment-13036531
 ] 

Todd Lipcon commented on HDFS-1965:
---

I can think of a couple of possible solutions:

a) make the methods that operate on a block take an additional parameter 
containing the block token, rather than using the normal token selector 
mechanism that scopes credentials on a per-connection basis. This has the 
advantage that we can even re-use an IPC connection across different blocks.

b) when the client creates an IPC proxy to a DN, it can explicitly configure 
the maxIdleTime to 0 so that we don't leave connections hanging around after 
the call completes. This is less efficient than option (a) above, but that 
probably doesn't matter much for this use case.
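In a toy model of the connection cache, option (b) amounts to dropping the connection as soon as the call completes instead of leaving it cached until an idle timeout. The sketch below uses illustrative names, not Hadoop's actual IPC classes; in real client code this would correspond to creating the DN proxy with a zero idle time, or explicitly tearing the proxy down when finished.

```java
import java.util.HashMap;
import java.util.Map;

public class EagerCloseDemo {
    // Toy stand-in for the IPC connection cache.
    static final Map<String, Object> connections = new HashMap<>();

    // With an idle time of 0 the connection is dropped as soon as the call
    // finishes, so stale token-keyed entries never accumulate in the cache.
    static void callWithEagerClose(String address, String blockToken) {
        String key = address + "/" + blockToken;
        connections.put(key, new Object());   // connect
        // ... perform the RPC against the DataNode here ...
        connections.remove(key);              // disconnect immediately
    }

    public static void main(String[] args) {
        for (int i = 0; i < 1000; i++) {
            callWithEagerClose("datanode:50020", "block-token-" + i);
        }
        System.out.println(connections.size()); // prints 0
    }
}
```

The trade-off is that every call pays a fresh connection setup cost, which option (a) would avoid by letting one connection serve RPCs for many blocks.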
