[jira] [Updated] (HDFS-10834) Add concat to libhdfs API

2016-09-02 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-10834:
-
Status: Patch Available  (was: Open)

> Add concat to libhdfs API
> -
>
> Key: HDFS-10834
> URL: https://issues.apache.org/jira/browse/HDFS-10834
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, libhdfs
>Reporter: Gary Helmling
> Attachments: HDFS-10834.001.patch
>
>
> libhdfs does not currently provide access to calling FileSystem.concat().  
> Let's add a function for it.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Updated] (HDFS-10834) Add concat to libhdfs API

2016-09-02 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-10834:
-
Attachment: HDFS-10834.001.patch

Here is a patch from [~sunchensamurai] adding hdfsConcat to the libhdfs API and 
a test for it.

Can someone with proper access please assign the issue to him, so that he can 
make any further updates?

> Add concat to libhdfs API
> -
>
> Key: HDFS-10834
> URL: https://issues.apache.org/jira/browse/HDFS-10834
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, libhdfs
>Reporter: Gary Helmling
> Attachments: HDFS-10834.001.patch
>
>
> libhdfs does not currently provide access to calling FileSystem.concat().  
> Let's add a function for it.






[jira] [Updated] (HDFS-10834) Add concat to libhdfs API

2016-09-02 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-10834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-10834:
-
Component/s: hdfs

> Add concat to libhdfs API
> -
>
> Key: HDFS-10834
> URL: https://issues.apache.org/jira/browse/HDFS-10834
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: hdfs, libhdfs
>Reporter: Gary Helmling
>
> libhdfs does not currently provide access to calling FileSystem.concat().  
> Let's add a function for it.






[jira] [Created] (HDFS-10834) Add concat to libhdfs API

2016-09-02 Thread Gary Helmling (JIRA)
Gary Helmling created HDFS-10834:


 Summary: Add concat to libhdfs API
 Key: HDFS-10834
 URL: https://issues.apache.org/jira/browse/HDFS-10834
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: libhdfs
Reporter: Gary Helmling


libhdfs does not currently provide access to calling FileSystem.concat().  
Let's add a function for it.






[jira] [Updated] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-07-07 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9805:

Attachment: HDFS-9805.branch-2.001.patch

[~cmccabe] Here is a patch backported to branch-2.  It should apply cleanly.  
Do I need to re-open the issue to get the Yetus build to run?

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Fix For: 3.0.0-alpha1
>
> Attachments: HDFS-9805.002.patch, HDFS-9805.003.patch, 
> HDFS-9805.004.patch, HDFS-9805.005.patch, HDFS-9805.branch-2.001.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}






[jira] [Commented] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-06-29 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356261#comment-15356261
 ] 

Gary Helmling commented on HDFS-9805:
-

For the test failures:
* TestBlockScanner passes for me locally
* TestNameNodeMetadataConsistency seems flaky.  It failed 1 time out of 10 for 
me locally.
* TestOfflineEditsViewer is failing in trunk

[~cmccabe], [~jojochuang]: could you take a look at the latest patch?

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.002.patch, HDFS-9805.003.patch, 
> HDFS-9805.004.patch, HDFS-9805.005.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}






[jira] [Updated] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-06-27 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9805:

Attachment: HDFS-9805.005.patch

Another update to address test and checkstyle issues:

* adds dfs.data.transfer.server.tcpnodelay to hdfs-default.xml to fix 
TestHdfsConfigFields
* fixes checkstyle line length issues

Of the other reported test failures:
* TestOpenFilesWithSnapshot, TestRollingFileSystemSinkWithHdfs both pass for me 
locally
* TestOfflineEditsViewer seems to be already failing on trunk

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.002.patch, HDFS-9805.003.patch, 
> HDFS-9805.004.patch, HDFS-9805.005.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}






[jira] [Updated] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-06-27 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9805:

Attachment: HDFS-9805.004.patch

Updated patch adding:

* a new configuration property (dfs.data.transfer.server.tcpnodelay), 
defaulting to true, controlling the TCP_NODELAY setting in the DN -> DN 
transfer path
* a test case checking that TCP_NODELAY is enabled on all sockets used when the 
relevant config settings are enabled

Note that I had to modify {{DataNode#newSocket()}} in this patch in order to 
support the test case.  Prior to this, {{newSocket()}} was not using the 
configured socket factory, instead creating sockets directly.  This seems like 
a change we would want anyway, just calling it out.
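
To illustrate the idea (this is only a rough sketch, not the code in the patch): 
if sockets are always created through a socket factory, a test can substitute a 
factory that records every socket it hands out and then check TCP_NODELAY on 
each one.  The class names below are hypothetical, and {{java.util.Properties}} 
stands in for the Hadoop configuration object; only the property name and its 
default of true come from the description above.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.Socket;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;
import javax.net.SocketFactory;

public class ConfiguredNoDelay {
    // Property name taken from the patch description; everything else is illustrative.
    public static final String KEY = "dfs.data.transfer.server.tcpnodelay";

    /** Hypothetical test-only factory that remembers every socket it creates. */
    public static class RecordingSocketFactory extends SocketFactory {
        public final List<Socket> created = new ArrayList<>();

        private Socket record(Socket s) {
            created.add(s);
            return s;
        }

        @Override
        public Socket createSocket() {
            return record(new Socket());
        }

        @Override
        public Socket createSocket(String host, int port) throws IOException {
            return record(new Socket(host, port));
        }

        @Override
        public Socket createSocket(String host, int port, InetAddress local, int localPort)
                throws IOException {
            return record(new Socket(host, port, local, localPort));
        }

        @Override
        public Socket createSocket(InetAddress host, int port) throws IOException {
            return record(new Socket(host, port));
        }

        @Override
        public Socket createSocket(InetAddress host, int port, InetAddress local, int localPort)
                throws IOException {
            return record(new Socket(host, port, local, localPort));
        }
    }

    /**
     * Creates an unconnected socket through the given factory and applies the
     * configured TCP_NODELAY value, treating a missing property as true.
     */
    public static Socket newSocket(SocketFactory factory, Properties conf) throws IOException {
        boolean noDelay = Boolean.parseBoolean(conf.getProperty(KEY, "true"));
        Socket s = factory.createSocket();
        s.setTcpNoDelay(noDelay);
        return s;
    }
}
```

A test built this way would call {{newSocket()}} with the recording factory and 
then assert {{getTcpNoDelay()}} on every entry in {{created}}.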

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.002.patch, HDFS-9805.003.patch, 
> HDFS-9805.004.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}






[jira] [Issue Comment Deleted] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-06-06 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9805:

Comment: was deleted

(was: [~jojochuang], I have a version that adds in the config already.  I'll 
take a stab at the unit test and post an update tomorrow.)

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.002.patch, HDFS-9805.003.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}






[jira] [Commented] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-06-05 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316210#comment-15316210
 ] 

Gary Helmling commented on HDFS-9805:
-

[~jojochuang], I have a version that adds in the config already.  I'll take a 
stab at the unit test and post an update tomorrow.

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.002.patch, HDFS-9805.003.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}






[jira] [Commented] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-06-05 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15316208#comment-15316208
 ] 

Gary Helmling commented on HDFS-9805:
-

[~jojochuang], I have a version that adds in the config already.  I'll take a 
stab at the unit test and post an update tomorrow.

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.002.patch, HDFS-9805.003.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}






[jira] [Commented] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-04-12 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15238168#comment-15238168
 ] 

Gary Helmling commented on HDFS-9805:
-

[~cmccabe], is {{ipc.server.tcpnodelay}} the right config to use?  Or do we 
need a {{dfs.data.transfer.server.tcpnodelay}} config similar to the 
{{dfs.data.transfer.client.tcpnodelay}} added for HDFS-9700?  I don't want to 
introduce another config, but since these changes all seem to be on the data 
transfer side, that seems consistent.

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.002.patch, HDFS-9805.003.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}





[jira] [Commented] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-03-31 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15220852#comment-15220852
 ] 

Gary Helmling commented on HDFS-9805:
-

[~cmccabe] I've updated the patch against the latest trunk.  I can do a 
branch-2.8 version as well if that helps.  Please let me know what you think.

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.002.patch, HDFS-9805.003.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}





[jira] [Updated] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-03-31 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9805:

Attachment: HDFS-9805.003.patch

New patch rebased against the latest trunk.

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.002.patch, HDFS-9805.003.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}





[jira] [Updated] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-03-28 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9805:

Attachment: HDFS-9805.002.patch

Here's a new patch rebased against the latest trunk.

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.002.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}





[jira] [Updated] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-02-24 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9805:

Target Version/s: 2.8.0

Ping. Any takers for this change?  It's pretty straightforward, though I can 
add a separate config for it if necessary.

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.001.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}





[jira] [Commented] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-02-18 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15153064#comment-15153064
 ] 

Gary Helmling commented on HDFS-9805:
-

The checkstyle warning is on the {{DataXceiver::writeBlock()}} method being 236 
lines long, greater than the maximum allowed of 150.  This patch adds a single 
line to {{writeBlock()}}, which I think triggered the warning.  But I don't 
think it makes sense to refactor the method as part of this issue.

For java 8, the following tests failed or timed out:
{noformat}
Failed tests: 
  TestDatanodeRegistration.testForcedRegistration:382 null
  TestHAAppend.testMultipleAppendsDuringCatchupTailing:125 inode should 
complete in ~6 ms.
Expected: is 
 but: was 
  TestStandbyCheckpoints.testCheckpointCancellationDuringUpload:347 
expected:<0> but was:<108>

Tests in error: 
  TestDelegationTokenForProxyUser.testWebHdfsDoAs:161 ?  test timed out after 
50...
  
TestDirectoryScanner.testThrottling:584->createFile:108->Object.wait:502->Object.wait:-2
 ? 
  
TestEditLogTailer.testNN1TriggersLogRolls:115->testStandbyTriggersLogRolls:147->waitForLogRollInSharedDir:168
 ? Timeout
  
TestSecureNNWithQJM.testSecureMode:167->doNNWithQJMTest:187->restartNameNode:205->Object.wait:502->Object.wait:-2
 ? 
  
TestSecureNNWithQJM.testSecondaryNameNodeHttpAddressNotNeeded:173->doNNWithQJMTest:193->restartNameNode:205
 ? EditLogInput
{noformat}

All of these pass for me locally.

For java 7, the following tests failed or timed out:
{noformat}
Failed tests: 
  TestDataNodeVolumeFailure.testUnderReplicationAfterVolFailure:412 There is no 
under replicated block after volume failure
  TestBlockReplacement.testDeletedBlockWhenAddBlockIsInEdit:436 The block 
should be only on 1 datanode  expected:<1> but was:<2>
  TestNameNodeMetadataConsistency.testGenerationStampInFuture:113 expected:<17> 
but was:<0>
  TestHAAppend.testMultipleAppendsDuringCatchupTailing:125 inode should 
complete in ~6 ms.
Expected: is 
 but: was 

Tests in error: 
  
TestRollingFileSystemSinkWithSecureHdfs.testMissingPropertiesWithSecureHDFS:145->createDirectoriesSecurely:191
 ? IO
  
TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten:609->testRemoveVolumeBeingWrittenForDatanode:686
 ? Timeout
  TestDataNodeMultipleRegistrations.testClusterIdMismatchAtStartupWithHA:253 ?  
...
  
TestNNHandlesCombinedBlockReport>BlockReportTestBase.testOneReplicaRbwReportArrivesAfterBlockCompleted:630
 ? 
  
TestDirectoryScanner.testThrottling:584->createFile:108->Object.wait:503->Object.wait:-2
 ? 
  TestSecureNameNode.testName:65 ? IO Failed on local exception: 
java.io.IOExcep...
  TestFileTruncate.testTruncateWithDataNodesRestart:704 ? Timeout Timed out 
wait...
  TestEncryptionZones.testStartFileRetry:1067 ?  test timed out after 12 
mil...
{noformat}

All of these pass locally for me with java 7 as well. 

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.001.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}





[jira] [Updated] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-02-16 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9805:

Status: Patch Available  (was: Open)

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.001.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}





[jira] [Updated] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-02-16 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9805:

Attachment: HDFS-9805.001.patch

Attaching a very simple patch for trunk, which always sets TCP_NODELAY for 
these paths, following what is done in {{DFSUtilClient::peerFromSocket()}}.

Alternatively, since HDFS-9700 added a new config option 
{{dfs.data.transfer.client.tcpnodelay}} (defaulting to true), we could do 
something similar here and introduce a new config option (i.e. 
{{dfs.data.transfer.tcpnodelay}}) for these cases.
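
For readers less familiar with the ordering issue, here is a minimal sketch of 
the "always set it" approach using only {{java.net.Socket}}.  The {{Handshake}} 
interface is a hypothetical stand-in for the SASL negotiation step and is not a 
Hadoop type; the only point is that {{setTcpNoDelay(true)}} runs before any 
handshake bytes are written, so the small negotiation messages are not held 
back by Nagle's algorithm.

```java
import java.io.IOException;
import java.net.Socket;

public class NoDelayBeforeHandshake {
    /** Hypothetical stand-in for the SASL negotiation step. */
    public interface Handshake {
        void run(Socket socket) throws IOException;
    }

    /**
     * Enables TCP_NODELAY on the socket first, then runs the handshake,
     * so the option is already in effect when the handshake starts.
     */
    public static void prepareAndHandshake(Socket socket, Handshake handshake)
            throws IOException {
        socket.setTcpNoDelay(true);
        handshake.run(socket);
    }
}
```

Making the value configurable would only change where the boolean passed to 
{{setTcpNoDelay()}} comes from.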

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9805.001.patch
>
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}





[jira] [Commented] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-02-16 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15149314#comment-15149314
 ] 

Gary Helmling commented on HDFS-9805:
-

In trunk, it looks like there is an additional location where this happens:
* {{ErasureCodingWorker.ReconstructAndTransferBlock::initTargetStreams()}}

> TCP_NODELAY not set before SASL handshake in data transfer pipeline
> ---
>
> Key: HDFS-9805
> URL: https://issues.apache.org/jira/browse/HDFS-9805
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode
>Reporter: Gary Helmling
>Assignee: Gary Helmling
>
> There are a few places in the DN -> DN block transfer pipeline where 
> TCP_NODELAY is not set before doing a SASL handshake:
> * in {{DataNode.DataTransfer::run()}}
> * in {{DataXceiver::replaceBlock()}}
> * in {{DataXceiver::writeBlock()}}





[jira] [Created] (HDFS-9805) TCP_NODELAY not set before SASL handshake in data transfer pipeline

2016-02-12 Thread Gary Helmling (JIRA)
Gary Helmling created HDFS-9805:
---

 Summary: TCP_NODELAY not set before SASL handshake in data 
transfer pipeline
 Key: HDFS-9805
 URL: https://issues.apache.org/jira/browse/HDFS-9805
 Project: Hadoop HDFS
  Issue Type: Improvement
  Components: datanode
Reporter: Gary Helmling
Assignee: Gary Helmling


There are a few places in the DN -> DN block transfer pipeline where 
TCP_NODELAY is not set before doing a SASL handshake:

* in {{DataNode.DataTransfer::run()}}
* in {{DataXceiver::replaceBlock()}}
* in {{DataXceiver::writeBlock()}}






[jira] [Updated] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-02-10 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9700:

Attachment: HDFS-9700.004.patch

Updated patch for trunk addressing [~iwasakims]'s last comment.

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-branch-2.7.002.patch, 
> HDFS-9700-branch-2.7.003.patch, HDFS-9700-v1.patch, HDFS-9700-v2.patch, 
> HDFS-9700.002.patch, HDFS-9700.003.patch, HDFS-9700.004.patch, 
> HDFS-9700_branch-2.7-v2.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.





[jira] [Commented] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-02-03 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15131152#comment-15131152
 ] 

Gary Helmling commented on HDFS-9700:
-

Thanks for taking a look.  I'll rename the config key and variable names.

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-branch-2.7.002.patch, HDFS-9700-v1.patch, 
> HDFS-9700-v2.patch, HDFS-9700.002.patch, HDFS-9700_branch-2.7-v2.patch, 
> HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.





[jira] [Updated] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-02-03 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9700:

Attachment: HDFS-9700.003.patch
HDFS-9700-branch-2.7.003.patch

Updated patch renaming config key to "dfs.data.transfer.client.tcpnodelay" to 
clearly reflect this is only associated with the data transfer protocol client.

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-branch-2.7.002.patch, 
> HDFS-9700-branch-2.7.003.patch, HDFS-9700-v1.patch, HDFS-9700-v2.patch, 
> HDFS-9700.002.patch, HDFS-9700.003.patch, HDFS-9700_branch-2.7-v2.patch, 
> HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.





[jira] [Updated] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-02-03 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9700:

Attachment: HDFS-9700.002.patch
HDFS-9700-branch-2.7.002.patch

Renaming and reattaching the patches from yesterday in an attempt to make yetus 
happy.

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-branch-2.7.002.patch, HDFS-9700-v1.patch, 
> HDFS-9700-v2.patch, HDFS-9700.002.patch, HDFS-9700_branch-2.7-v2.patch, 
> HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.





[jira] [Updated] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-02-02 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9700:

Attachment: HDFS-9700_branch-2.7-v2.patch
HDFS-9700-v2.patch

Here are updated patches for trunk and branch-2.7.  These change the setting to 
use a new config key: "dfs.client.socket.tcpnodelay".  This defaults to true, 
since HADOOP-8069 has already updated the default for 
ipc.(client|server).tcpnodelay to true.
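
As a rough sketch of the behavior described above (not the attached patch itself): read a boolean client setting that defaults to true and apply it to the socket before it is connected. To keep the example self-contained, {{java.util.Properties}} stands in for Hadoop's {{Configuration}}; the class and method names are hypothetical, and only the key name comes from this comment.

```java
import java.io.IOException;
import java.net.Socket;
import java.util.Properties;

public class TcpNoDelayConfigSketch {
    // Key name as given in the comment above; the default of true mirrors
    // the default stated there.
    static final String KEY = "dfs.client.socket.tcpnodelay";
    static final boolean DEFAULT = true;

    // Properties is an assumption standing in for Hadoop's Configuration.
    static boolean tcpNoDelay(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(KEY, String.valueOf(DEFAULT)));
    }

    // Create an unconnected socket and apply the setting before any connect/send.
    static Socket newSocket(Properties conf) throws IOException {
        Socket socket = new Socket();
        socket.setTcpNoDelay(tcpNoDelay(conf));
        return socket;
    }

    public static void main(String[] args) throws IOException {
        Properties conf = new Properties();
        try (Socket socket = newSocket(conf)) {
            System.out.println("TCP_NODELAY=" + socket.getTcpNoDelay());
        }
    }
}
```

Setting the option on the unconnected socket means it is already in effect for the first bytes written after the connection is established.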

I'll see if I can do some basic benchmarks with TestDFSIO with and without the 
setting.  I don't expect this to make things worse, but neither do I expect 
much improvement in a throughput test like TestDFSIO.

Where we do see a difference from this is in the P95 write latencies for our 
HBase clusters with HDFS security enabled.

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-v1.patch, HDFS-9700-v2.patch, 
> HDFS-9700_branch-2.7-v2.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.





[jira] [Commented] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-01-29 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15124522#comment-15124522
 ] 

Gary Helmling commented on HDFS-9700:
-

bq. Should we use another key or always set TCP_NODELAY as 
DFSUtilClient#peerFromSocket do, if it is crucial?

Sure, I'm happy to just always call {{setTcpNoDelay(true)}} on the sockets 
created, similar to what is done in {{peerFromSocket}}.  Should I do that for 
both cases, or should the one in {{DFSClient#connectToDN}} still be governed by 
the client config key?
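
To make the two options in this question concrete, here is a minimal sketch. The class and method names are hypothetical and this is not the actual {{DFSClient}} or {{DFSOutputStream}} code; it only contrasts an unconditional {{setTcpNoDelay(true)}} with a variant where the caller passes in a flag (for example, one read from client configuration).

```java
import java.io.IOException;
import java.net.Socket;

public class AlwaysNoDelaySketch {
    // Option A: always enable TCP_NODELAY, regardless of configuration.
    static Socket createSocket() throws IOException {
        Socket socket = new Socket(); // unconnected
        socket.setTcpNoDelay(true);
        return socket;
    }

    // Option B: let a caller-supplied flag decide (e.g. a client config value).
    static Socket createSocket(boolean tcpNoDelay) throws IOException {
        Socket socket = new Socket(); // unconnected
        socket.setTcpNoDelay(tcpNoDelay);
        return socket;
    }

    public static void main(String[] args) throws IOException {
        try (Socket always = createSocket(); Socket governed = createSocket(false)) {
            System.out.println("always=" + always.getTcpNoDelay()
                + " governed=" + governed.getTcpNoDelay());
        }
    }
}
```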

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-v1.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.





[jira] [Assigned] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-01-28 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling reassigned HDFS-9700:
---

Assignee: Gary Helmling

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
>Assignee: Gary Helmling
> Attachments: HDFS-9700-v1.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.





[jira] [Updated] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-01-25 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9700:

Attachment: HDFS-9700_branch-2.7.patch

The attached patch is against branch-2.7.  For an HBase deployment on secure 
Hadoop, this reliably lowers our P95 write latencies from 40ms+ to ~2ms.

I'm still working out how/if the same changes apply to trunk.

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
> Attachments: HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.





[jira] [Updated] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-01-25 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9700:

Attachment: HDFS-9700-v1.patch

Attaching a patch for the same changes against trunk.

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.7.1, 2.6.3
>Reporter: Gary Helmling
> Attachments: HDFS-9700-v1.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.





[jira] [Updated] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-01-25 Thread Gary Helmling (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-9700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gary Helmling updated HDFS-9700:

Status: Patch Available  (was: Open)

> DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots
> 
>
> Key: HDFS-9700
> URL: https://issues.apache.org/jira/browse/HDFS-9700
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 2.6.3, 2.7.1
>Reporter: Gary Helmling
> Attachments: HDFS-9700-v1.patch, HDFS-9700_branch-2.7.patch
>
>
> In {{DFSClient.connectToDN()}} and 
> {{DFSOutputStream.createSocketForPipeline()}}, we never call 
> {{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
> we should respect the value of ipc.client.tcpnodelay in the configuration.
> While this applies whether security is enabled or not, it seems to have a 
> bigger impact on latency when security is enabled.





[jira] [Created] (HDFS-9700) DFSClient and DFSOutputStream do not respect TCP_NODELAY config in two spots

2016-01-25 Thread Gary Helmling (JIRA)

Gary Helmling created HDFS-9700:
---

 Summary: DFSClient and DFSOutputStream do not respect TCP_NODELAY 
config in two spots
 Key: HDFS-9700
 URL: https://issues.apache.org/jira/browse/HDFS-9700
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: 2.6.3, 2.7.1
Reporter: Gary Helmling


In {{DFSClient.connectToDN()}} and 
{{DFSOutputStream.createSocketForPipeline()}}, we never call 
{{setTcpNoDelay()}} on the constructed socket before sending.  In both cases, 
we should respect the value of ipc.client.tcpnodelay in the configuration.

While this applies whether security is enabled or not, it seems to have a 
bigger impact on latency when security is enabled.





[jira] [Commented] (HDFS-9669) TcpPeerServer should respect ipc.server.listen.queue.size

2016-01-20 Thread Gary Helmling (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109138#comment-15109138
 ] 

Gary Helmling commented on HDFS-9669:
-

Patch looks good to me (non-binding).  Seems like it closes the only hole in 
applying ipc.server.listen.queue.size.
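
For context, a minimal sketch of what honoring a listen queue size looks like at the {{java.net.ServerSocket}} level. When no backlog is passed, {{ServerSocket}} falls back to its documented default of 50, which is the limit the issue description refers to. {{java.util.Properties}} stands in for Hadoop's {{Configuration}}, the fallback value of 128 is purely illustrative, and the class and method names are hypothetical; this is not the code from the attached patch.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.util.Properties;

public class ListenBacklogSketch {
    // Key name from the issue title; the fallback is an illustrative assumption.
    static final String KEY = "ipc.server.listen.queue.size";
    static final int FALLBACK = 128;

    // Properties is an assumption standing in for Hadoop's Configuration.
    static int backlog(Properties conf) {
        return Integer.parseInt(conf.getProperty(KEY, String.valueOf(FALLBACK)));
    }

    // Bind with an explicit backlog instead of relying on the JDK default of 50.
    static ServerSocket listen(Properties conf) throws IOException {
        ServerSocket server = new ServerSocket(); // unbound
        // Port 0 asks the OS for any free port; loopback keeps the demo local.
        server.bind(new InetSocketAddress(InetAddress.getLoopbackAddress(), 0),
            backlog(conf));
        return server;
    }

    public static void main(String[] args) throws IOException {
        Properties conf = new Properties();
        conf.setProperty(KEY, "256");
        try (ServerSocket server = listen(conf)) {
            System.out.println("bound=" + server.isBound()
                + " requestedBacklog=" + backlog(conf));
        }
    }
}
```

The backlog argument is a request to the operating system, which may cap it (on Linux, via net.core.somaxconn), so the configured value is an upper bound rather than a guarantee.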

> TcpPeerServer should respect ipc.server.listen.queue.size
> -
>
> Key: HDFS-9669
> URL: https://issues.apache.org/jira/browse/HDFS-9669
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.2
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: HDFS-9669.0.patch, HDFS-9669.1.patch, HDFS-9669.1.patch
>
>
> On periods of high traffic we are seeing:
> {code}
> 16/01/19 23:40:40 WARN hdfs.DFSClient: Connection failure: Failed to connect 
> to /10.138.178.47:50010 for file /MYPATH/MYFILE for block 
> BP-1935559084-10.138.112.27-1449689748174:blk_1080898601_7375294:java.io.IOException:
>  Connection reset by peer
> java.io.IOException: Connection reset by peer
>   at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>   at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>   at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93)
>   at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>   at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471)
>   at 
> org.apache.hadoop.net.SocketOutputStream$Writer.performIO(SocketOutputStream.java:63)
>   at 
> org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:159)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:117)
>   at 
> org.apache.hadoop.net.SocketOutputStream.write(SocketOutputStream.java:109)
>   at java.io.DataOutputStream.writeInt(DataOutputStream.java:197)
> {code}
> At the time that this happens there are far fewer xceivers than configured.
> On most JDKs this will make 50 the total backlog at any time. This 
> effectively means that any GC + busy time will result in TCP resets.
> http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/tip/src/share/classes/java/net/ServerSocket.java#l370


