[jira] [Updated] (HDFS-4227) Document dfs.namenode.resource.*

2012-12-14 Thread Daisuke Kobayashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daisuke Kobayashi updated HDFS-4227:


Attachment: HDFS-4227.patch

New patch attached. Can you review?

> Document dfs.namenode.resource.*  
> --
>
> Key: HDFS-4227
> URL: https://issues.apache.org/jira/browse/HDFS-4227
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Daisuke Kobayashi
>  Labels: newbie
> Attachments: HDFS-4227.patch
>
>
> Let's document {{dfs.namenode.resource.*}} in hdfs-default.xml and add a 
> section in the HDFS docs that covers local directories.
> {{dfs.namenode.resource.check.interval}} - the interval in ms at which the 
> NameNode resource checker runs (default is 5000)
> {{dfs.namenode.resource.du.reserved}} - the amount of space to 
> reserve/require for a NN storage directory (default is 100 MB)
> {{dfs.namenode.resource.checked.volumes}} - a list of local directories for 
> the NN resource checker to check in addition to the local edits directories 
> (default is empty).
> {{dfs.namenode.resource.checked.volumes.minimum}} - the minimum number of 
> redundant NN storage volumes required (default is 1). Even if no redundant 
> resources are available, we don't enter safe mode as long as sufficient 
> required resources are available.
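
For illustration, here is a hedged hdfs-site.xml sketch of the four properties 
with the defaults described above (assuming, as an illustration, that 
du.reserved is given in bytes, so 100 MB is written as 104857600):

{code}
<!-- Sketch for hdfs-site.xml; values are the defaults described above. -->
<property>
  <name>dfs.namenode.resource.check.interval</name>
  <value>5000</value>  <!-- ms between NameNode resource checker runs -->
</property>
<property>
  <name>dfs.namenode.resource.du.reserved</name>
  <value>104857600</value>  <!-- bytes to reserve per NN storage directory -->
</property>
<property>
  <name>dfs.namenode.resource.checked.volumes</name>
  <value></value>  <!-- extra local directories to check; empty by default -->
</property>
<property>
  <name>dfs.namenode.resource.checked.volumes.minimum</name>
  <value>1</value>  <!-- minimum number of redundant NN storage volumes -->
</property>
{code}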

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Assigned] (HDFS-4227) Document dfs.namenode.resource.*

2012-12-14 Thread Daisuke Kobayashi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Daisuke Kobayashi reassigned HDFS-4227:
---

Assignee: Daisuke Kobayashi

> Document dfs.namenode.resource.*  
> --
>
> Key: HDFS-4227
> URL: https://issues.apache.org/jira/browse/HDFS-4227
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: documentation
>Affects Versions: 1.0.0, 2.0.0-alpha
>Reporter: Eli Collins
>Assignee: Daisuke Kobayashi
>  Labels: newbie
>
> Let's document {{dfs.namenode.resource.*}} in hdfs-default.xml and add a 
> section in the HDFS docs that covers local directories.
> {{dfs.namenode.resource.check.interval}} - the interval in ms at which the 
> NameNode resource checker runs (default is 5000)
> {{dfs.namenode.resource.du.reserved}} - the amount of space to 
> reserve/require for a NN storage directory (default is 100 MB)
> {{dfs.namenode.resource.checked.volumes}} - a list of local directories for 
> the NN resource checker to check in addition to the local edits directories 
> (default is empty).
> {{dfs.namenode.resource.checked.volumes.minimum}} - the minimum number of 
> redundant NN storage volumes required (default is 1). Even if no redundant 
> resources are available, we don't enter safe mode as long as sufficient 
> required resources are available.



[jira] [Commented] (HDFS-347) DFS read performance suboptimal when client co-located on nodes with data

2012-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532938#comment-13532938
 ] 

Hadoop QA commented on HDFS-347:


{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12561092/HDFS-347.027.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 14 new 
or modified test files.

  {color:red}-1 javac{color}.  The applied patch generated 2013 javac 
compiler warnings (more than the trunk's current 2012 warnings).

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3669//testReport/
Javac warnings: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3669//artifact/trunk/patchprocess/diffJavacWarnings.txt
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3669//console

This message is automatically generated.

> DFS read performance suboptimal when client co-located on nodes with data
> -
>
> Key: HDFS-347
> URL: https://issues.apache.org/jira/browse/HDFS-347
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, hdfs-client, performance
>Reporter: George Porter
>Assignee: Colin Patrick McCabe
> Attachments: all.tsv, BlockReaderLocal1.txt, HADOOP-4801.1.patch, 
> HADOOP-4801.2.patch, HADOOP-4801.3.patch, HDFS-347-016_cleaned.patch, 
> HDFS-347.016.patch, HDFS-347.017.clean.patch, HDFS-347.017.patch, 
> HDFS-347.018.clean.patch, HDFS-347.018.patch2, HDFS-347.019.patch, 
> HDFS-347.020.patch, HDFS-347.021.patch, HDFS-347.022.patch, 
> HDFS-347.024.patch, HDFS-347.025.patch, HDFS-347.026.patch, 
> HDFS-347.027.patch, HDFS-347-branch-20-append.txt, hdfs-347.png, 
> hdfs-347.txt, local-reads-doc
>
>
> One of the major strategies Hadoop uses to get scalable data processing is to 
> move the code to the data.  However, putting the DFS client on the same 
> physical node as the data blocks it acts on doesn't improve read performance 
> as much as expected.
> After looking at Hadoop and O/S traces (via HADOOP-4049), I think the problem 
> is due to the HDFS streaming protocol causing many more read I/O operations 
> (iops) than necessary.  Consider the case of a DFSClient fetching a 64 MB 
> disk block from the DataNode process (running in a separate JVM) running on 
> the same machine.  The DataNode will satisfy the single disk block request by 
> sending data back to the HDFS client in 64-KB chunks.  In BlockSender.java, 
> this is done in the sendChunk() method, relying on Java's transferTo() 
> method.  Depending on the host O/S and JVM implementation, transferTo() is 
> implemented as either a sendfilev() syscall or a pair of mmap() and write().  
> In either case, each chunk is read from the disk via a separate I/O 
> operation.  The result is that the single request for a 64-MB block ends up 
> hitting the disk as over a thousand smaller requests of 64 KB each.
> Since the DFSClient runs in a different JVM and process than the DataNode, 
> shuttling data from the disk to the DFSClient also results in context 
> switches each time network packets get sent (in this case, the 64-KB chunk 
> turns into a large number of 1500-byte packet send operations).  Thus we see 
> a large number of context switches for each block send operation.
> I'd like to get some feedback on the best way to address this, but I think 
> the answer is to provide a mechanism for a DFSClient to directly open data 
> blocks that happen to be on the same machine.  It could do this by examining 
> the set of LocatedBlocks returned by the NameNode, marking those that should 
> be resident on the local host.  Since the DataNode and DFSClient (probably) 
> share the same hadoop configuration, the DFSClient should be able to find the 
> files holding the block data, and it could directly open them and send data 
> back to the client.  This would avoid the context switches imposed by the 
> network layer, and would allow for much larger read buffers than 64KB, which 
> should reduce the number of iops imposed by each read block operation.
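
To make the chunking concrete, here is a minimal, hypothetical sketch of the 
per-chunk transferTo() pattern described above (illustrative names, not the 
actual BlockSender code):

{code}
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;

// Hypothetical sketch, not BlockSender itself: each 64-KB transferTo()
// call can reach the disk as a separate I/O, so one 64-MB block request
// becomes roughly a thousand small reads.
class ChunkedSendSketch {
  static void sendBlock(FileChannel blockFile, WritableByteChannel sock)
      throws IOException {
    final int CHUNK = 64 * 1024;  // 64 KB, the chunk size described above
    long pos = 0;
    long size = blockFile.size();
    while (pos < size) {
      long sent = blockFile.transferTo(pos, Math.min(CHUNK, size - pos), sock);
      if (sent <= 0) {
        break;  // peer closed or nothing transferred
      }
      pos += sent;  // ~1024 iterations for a 64-MB block
    }
  }
}
{code}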

[jira] [Commented] (HDFS-4315) DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access

2012-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532928#comment-13532928
 ] 

Hadoop QA commented on HDFS-4315:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12561073/HDFS-4315.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3668//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3668//console

This message is automatically generated.

> DNs with multiple BPs can have BPOfferServices fail to start due to 
> unsynchronized map access
> -
>
> Key: HDFS-4315
> URL: https://issues.apache.org/jira/browse/HDFS-4315
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.0.2-alpha
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-4315.patch
>
>
> In some nightly test runs we've seen pretty frequent failures of 
> TestWebHdfsWithMultipleNameNodes. I've traced the root cause to an 
> unsynchronized map access in the DataStorage class.
> More details in the first comment.



[jira] [Updated] (HDFS-347) DFS read performance suboptimal when client co-located on nodes with data

2012-12-14 Thread Colin Patrick McCabe (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Colin Patrick McCabe updated HDFS-347:
--

Attachment: HDFS-347.027.patch

This doesn't address all the points from the Review Board review (I'm still 
working on another rev that does).  However, it does include the path security 
validation, the addition of {{dfs.client.domain.socket.data.traffic}}, some 
refactoring of BlockReaderFactory, the addition of DomainSocketFactory, and 
the renaming of {{getBindPath}} to {{getBoundPath}}.

> DFS read performance suboptimal when client co-located on nodes with data
> -
>
> Key: HDFS-347
> URL: https://issues.apache.org/jira/browse/HDFS-347
> Project: Hadoop HDFS
>  Issue Type: Improvement
>  Components: datanode, hdfs-client, performance
>Reporter: George Porter
>Assignee: Colin Patrick McCabe
> Attachments: all.tsv, BlockReaderLocal1.txt, HADOOP-4801.1.patch, 
> HADOOP-4801.2.patch, HADOOP-4801.3.patch, HDFS-347-016_cleaned.patch, 
> HDFS-347.016.patch, HDFS-347.017.clean.patch, HDFS-347.017.patch, 
> HDFS-347.018.clean.patch, HDFS-347.018.patch2, HDFS-347.019.patch, 
> HDFS-347.020.patch, HDFS-347.021.patch, HDFS-347.022.patch, 
> HDFS-347.024.patch, HDFS-347.025.patch, HDFS-347.026.patch, 
> HDFS-347.027.patch, HDFS-347-branch-20-append.txt, hdfs-347.png, 
> hdfs-347.txt, local-reads-doc
>
>
> One of the major strategies Hadoop uses to get scalable data processing is to 
> move the code to the data.  However, putting the DFS client on the same 
> physical node as the data blocks it acts on doesn't improve read performance 
> as much as expected.
> After looking at Hadoop and O/S traces (via HADOOP-4049), I think the problem 
> is due to the HDFS streaming protocol causing many more read I/O operations 
> (iops) than necessary.  Consider the case of a DFSClient fetching a 64 MB 
> disk block from the DataNode process (running in a separate JVM) running on 
> the same machine.  The DataNode will satisfy the single disk block request by 
> sending data back to the HDFS client in 64-KB chunks.  In BlockSender.java, 
> this is done in the sendChunk() method, relying on Java's transferTo() 
> method.  Depending on the host O/S and JVM implementation, transferTo() is 
> implemented as either a sendfilev() syscall or a pair of mmap() and write().  
> In either case, each chunk is read from the disk via a separate I/O 
> operation.  The result is that the single request for a 64-MB block ends up 
> hitting the disk as over a thousand smaller requests of 64 KB each.
> Since the DFSClient runs in a different JVM and process than the DataNode, 
> shuttling data from the disk to the DFSClient also results in context 
> switches each time network packets get sent (in this case, the 64-KB chunk 
> turns into a large number of 1500-byte packet send operations).  Thus we see 
> a large number of context switches for each block send operation.
> I'd like to get some feedback on the best way to address this, but I think 
> the answer is to provide a mechanism for a DFSClient to directly open data 
> blocks that happen to be on the same machine.  It could do this by examining 
> the set of LocatedBlocks returned by the NameNode, marking those that should 
> be resident on the local host.  Since the DataNode and DFSClient (probably) 
> share the same hadoop configuration, the DFSClient should be able to find the 
> files holding the block data, and it could directly open them and send data 
> back to the client.  This would avoid the context switches imposed by the 
> network layer, and would allow for much larger read buffers than 64KB, which 
> should reduce the number of iops imposed by each read block operation.



[jira] [Commented] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance

2012-12-14 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532901#comment-13532901
 ] 

Andy Isaacson commented on HDFS-4253:
-

bq. I don't see any reason why shuffle(a) could not be equal to shuffle(b), for 
two completely unrelated DatanodeIDs a and b.

That's true, equality is possible.  It's very unlikely, though, given that 
we're choosing N items (where N is the replication count of a block, so nearly 
always 3, sometimes 10, possibly as absurdly high as 50) from the range of 
{{Random#nextInt}}, which is about 2**32.  The algorithm does something 
reasonable in the case that the shuffle has a collision (it puts the items in 
some order, either stable or not, and either result is fine for the rest of 
the algorithm).  It would be possible to remove the possibility of collisions, 
but I don't know how to do that quickly with minimal code.  So the current 
implementation seemed to strike a nice balance between the desired behavior, 
efficient and easily understandable code, and low algorithmic complexity.

bq. It also seems better to just use hashCode, rather than creating your own 
random set of random ints associated with objects.

It's important that we get a different answer each time 
{{pseudoSortByDistance}} is invoked; that randomization is what spreads the 
read load out across the replicas.  So using a stable value like hashCode 
would defeat the goal of this change.  (It might be true that hashCode 
ordering would differ between {{DFSClient}} instances on different nodes, but 
I see no guarantee of that, and even if it's true, depending on such a subtle 
implementation detail would be dangerous.  And it still doesn't resolve the 
issue that a single DFSClient should pick different replicas from a given 
distance class for different reads of a given block.)
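
As a hedged illustration of the per-sort shuffle under discussion (names are 
illustrative, not the actual NetworkTopology code): drawing a fresh random int 
per sort means repeated sorts of the same replica list come out in different 
orders, which is the load-spreading property described above.

{code}
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch: assign each node a fresh random int per sort, so
// equal-distance replicas are ordered differently on every invocation.
class ShuffleSortSketch {
  static <T> void pseudoShuffle(List<T> nodes, Random r) {
    final Map<T, Integer> shuffle = new HashMap<>();
    for (T n : nodes) {
      // Collisions are possible but rare: each pair collides with
      // probability about 2**-32, as discussed above.
      shuffle.put(n, r.nextInt());
    }
    nodes.sort(Comparator.comparingInt(shuffle::get));
  }

  public static void main(String[] args) {
    List<String> replicas = new ArrayList<>(List.of("dn1", "dn2", "dn3"));
    pseudoShuffle(replicas, new Random());
    System.out.println(replicas);  // order varies across runs
  }
}
{code}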

> block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
> -
>
> Key: HDFS-4253
> URL: https://issues.apache.org/jira/browse/HDFS-4253
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253.txt
>
>
> When many nodes (10) read from the same block simultaneously, we get 
> asymmetric distribution of read load.  This can result in slow block reads 
> when one replica is serving most of the readers and the other replicas are 
> idle.  The busy DN bottlenecks on its network link.
> This is especially visible with large block sizes and high replica counts (I 
> reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication 
> 5), but the same behavior happens on a small scale with normal-sized blocks 
> and replication=3.
> The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which 
> explicitly does not try to spread traffic among replicas in a given rack -- 
> it only randomizes usage for off-rack replicas.



[jira] [Commented] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance

2012-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532896#comment-13532896
 ] 

Hadoop QA commented on HDFS-4253:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12561062/hdfs4253-2.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3667//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3667//console

This message is automatically generated.

> block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
> -
>
> Key: HDFS-4253
> URL: https://issues.apache.org/jira/browse/HDFS-4253
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253.txt
>
>
> When many nodes (10) read from the same block simultaneously, we get 
> asymmetric distribution of read load.  This can result in slow block reads 
> when one replica is serving most of the readers and the other replicas are 
> idle.  The busy DN bottlenecks on its network link.
> This is especially visible with large block sizes and high replica counts (I 
> reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication 
> 5), but the same behavior happens on a small scale with normal-sized blocks 
> and replication=3.
> The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which 
> explicitly does not try to spread traffic among replicas in a given rack -- 
> it only randomizes usage for off-rack replicas.



[jira] [Commented] (HDFS-3429) DataNode reads checksums even if client does not need them

2012-12-14 Thread liang xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532895#comment-13532895
 ] 

liang xie commented on HDFS-3429:
-

And the HBase-specific issue is HBASE-5074, fixed in 0.94.0.

> DataNode reads checksums even if client does not need them
> --
>
> Key: HDFS-3429
> URL: https://issues.apache.org/jira/browse/HDFS-3429
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-3429-0.20.2.patch, hdfs-3429-0.20.2.patch, 
> hdfs-3429.txt, hdfs-3429.txt, hdfs-3429.txt
>
>
> Currently, even if the client does not want to verify checksums, the datanode 
> reads them anyway and sends them over the wire. This means that performance 
> improvements like HBase's application-level checksums don't have much benefit 
> when reading through the datanode, since the DN is still causing seeks into 
> the checksum file.
> (Credit goes to Dhruba for discovering this - filing on his behalf)



[jira] [Commented] (HDFS-3429) DataNode reads checksums even if client does not need them

2012-12-14 Thread liang xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532891#comment-13532891
 ] 

liang xie commented on HDFS-3429:
-

Oh, [~tlipcon], you missed my words: "without patch".

The strace showed the statistics without the patch.
After applying the patch, I no longer saw so many meta files being opened.

> DataNode reads checksums even if client does not need them
> --
>
> Key: HDFS-3429
> URL: https://issues.apache.org/jira/browse/HDFS-3429
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-3429-0.20.2.patch, hdfs-3429-0.20.2.patch, 
> hdfs-3429.txt, hdfs-3429.txt, hdfs-3429.txt
>
>
> Currently, even if the client does not want to verify checksums, the datanode 
> reads them anyway and sends them over the wire. This means that performance 
> improvements like HBase's application-level checksums don't have much benefit 
> when reading through the datanode, since the DN is still causing seeks into 
> the checksum file.
> (Credit goes to Dhruba for discovering this - filing on his behalf)



[jira] [Commented] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance

2012-12-14 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532864#comment-13532864
 ] 

Colin Patrick McCabe commented on HDFS-4253:


Thanks for clarifying that.  I still think there's a problem, though-- I don't 
see any reason why shuffle(a) could not be equal to shuffle(b), for two 
completely unrelated DatanodeIDs a and b.  This could be fixed by checking 
something that's supposed to be unique in the case where the two agree-- like 
the name field.  It also seems better to just use {{hashCode}}, rather than 
creating your own set of random ints associated with the objects.

> block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
> -
>
> Key: HDFS-4253
> URL: https://issues.apache.org/jira/browse/HDFS-4253
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253.txt
>
>
> When many nodes (10) read from the same block simultaneously, we get 
> asymmetric distribution of read load.  This can result in slow block reads 
> when one replica is serving most of the readers and the other replicas are 
> idle.  The busy DN bottlenecks on its network link.
> This is especially visible with large block sizes and high replica counts (I 
> reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication 
> 5), but the same behavior happens on a small scale with normal-sized blocks 
> and replication=3.
> The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which 
> explicitly does not try to spread traffic among replicas in a given rack -- 
> it only randomizes usage for off-rack replicas.



[jira] [Commented] (HDFS-4315) DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access

2012-12-14 Thread Eli Collins (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532860#comment-13532860
 ] 

Eli Collins commented on HDFS-4315:
---

Nice find!

+1 pending jenkins


> DNs with multiple BPs can have BPOfferServices fail to start due to 
> unsynchronized map access
> -
>
> Key: HDFS-4315
> URL: https://issues.apache.org/jira/browse/HDFS-4315
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.0.2-alpha
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-4315.patch
>
>
> In some nightly test runs we've seen pretty frequent failures of 
> TestWebHdfsWithMultipleNameNodes. I've traced the root cause to an 
> unsynchronized map access in the DataStorage class.
> More details in the first comment.



[jira] [Updated] (HDFS-4315) DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access

2012-12-14 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-4315:
-

Status: Patch Available  (was: Open)

> DNs with multiple BPs can have BPOfferServices fail to start due to 
> unsynchronized map access
> -
>
> Key: HDFS-4315
> URL: https://issues.apache.org/jira/browse/HDFS-4315
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.0.2-alpha
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-4315.patch
>
>
> In some nightly test runs we've seen pretty frequent failures of 
> TestWebHdfsWithMultipleNameNodes. I've traced the root cause to an 
> unsynchronized map access in the DataStorage class.
> More details in the first comment.



[jira] [Updated] (HDFS-4315) DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access

2012-12-14 Thread Aaron T. Myers (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aaron T. Myers updated HDFS-4315:
-

Attachment: HDFS-4315.patch

Here's a patch which addresses the issue. I've been looping the test for an 
hour now with no failures, whereas before it used to fail pretty reliably 
within 10 minutes. I'll keep it looping over the weekend and see how it goes.

This patch also takes the liberty of re-enabling the DN log in 
TestWebHdfsWithMultipleNameNodes, so that we can better see the root cause of 
later failures.

> DNs with multiple BPs can have BPOfferServices fail to start due to 
> unsynchronized map access
> -
>
> Key: HDFS-4315
> URL: https://issues.apache.org/jira/browse/HDFS-4315
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.0.2-alpha
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
> Attachments: HDFS-4315.patch
>
>
> In some nightly test runs we've seen pretty frequent failures of 
> TestWebHdfsWithMultipleNameNodes. I've traced the root cause to an 
> unsynchronized map access in the DataStorage class.
> More details in the first comment.



[jira] [Commented] (HDFS-4315) DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access

2012-12-14 Thread Aaron T. Myers (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532857#comment-13532857
 ] 

Aaron T. Myers commented on HDFS-4315:
--

In all of the failing test runs that I saw, the client would end up failing 
with an error like the following:

{noformat}
2012-12-14 16:30:36,818 WARN  hdfs.DFSClient (DFSOutputStream.java:run(562)) - 
DataStreamer Exception
java.io.IOException: Failed to add a datanode.  User may turn off this feature 
by setting dfs.client.block.write.replace-datanode-on-failure.policy in 
configuration, where the current policy is DEFAULT.  (Nodes: 
current=[127.0.0.1:52552, 127.0.0.1:43557], original=[127.0.0.1:43557, 
127.0.0.1:52552])
  at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:792)
  at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:852)
  at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:958)
 
  at 
org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:469)
{noformat}

This suggests that either an entire DN or one of the BPOfferServices of one of 
the DNs was not starting correctly, or had not started by the time the client 
was trying to access it. Unfortunately, TestWebHdfsWithMultipleNameNodes 
disables the DN logger, so it wasn't obvious what was causing that problem. 
Upon changing the test to not disable the logger and looping the test, I would 
occasionally see an error like the following:

{noformat}
java.lang.NullPointerException
  at 
org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:850)
  at 
org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:819)
  at 
org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:308)
  at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:218)
  at 
org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
  at java.lang.Thread.run(Thread.java:662)
{noformat}

This error would cause one of the BPOfferServices in one of the DNs to fail to 
come up. The reason is that concurrent, unsynchronized puts to the HashMap 
{{DataStorage#bpStorageMap}} result in undefined behavior, including 
previously-inserted entries no longer appearing to be in the map.
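
For illustration only (the attached patch may take a different approach), one 
standard way to make such concurrent puts safe is a synchronized map; the 
field name below mirrors the one mentioned above, the rest is hypothetical:

{code}
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the hazard and one possible guard: concurrent
// puts into a plain HashMap are undefined behavior and can "lose"
// previously inserted entries, matching the NPE symptom above.
class BpStorageMapSketch {
  private final Map<String, Object> bpStorageMap =
      Collections.synchronizedMap(new HashMap<String, Object>());

  void addBlockPool(String bpid, Object storage) {
    bpStorageMap.put(bpid, storage);  // safe for concurrent callers
  }

  Object getBlockPool(String bpid) {
    return bpStorageMap.get(bpid);
  }
}
{code}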

> DNs with multiple BPs can have BPOfferServices fail to start due to 
> unsynchronized map access
> -
>
> Key: HDFS-4315
> URL: https://issues.apache.org/jira/browse/HDFS-4315
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode
>Affects Versions: 2.0.2-alpha
>Reporter: Aaron T. Myers
>Assignee: Aaron T. Myers
>
> In some nightly test runs we've seen pretty frequent failures of 
> TestWebHdfsWithMultipleNameNodes. I've traced the root cause to an 
> unsynchronized map access in the DataStorage class.
> More details in the first comment.



[jira] [Created] (HDFS-4315) DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access

2012-12-14 Thread Aaron T. Myers (JIRA)
Aaron T. Myers created HDFS-4315:


 Summary: DNs with multiple BPs can have BPOfferServices fail to 
start due to unsynchronized map access
 Key: HDFS-4315
 URL: https://issues.apache.org/jira/browse/HDFS-4315
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: datanode
Affects Versions: 2.0.2-alpha
Reporter: Aaron T. Myers
Assignee: Aaron T. Myers


In some nightly test runs we've seen pretty frequent failures of 
TestWebHdfsWithMultipleNameNodes. I've traced the root cause to an 
unsynchronized map access in the DataStorage class.

More details in the first comment.



[jira] [Created] (HDFS-4314) failure to set sticky bit regression on branch-trunk-win

2012-12-14 Thread Chris Nauroth (JIRA)
Chris Nauroth created HDFS-4314:
---

 Summary: failure to set sticky bit regression on branch-trunk-win
 Key: HDFS-4314
 URL: https://issues.apache.org/jira/browse/HDFS-4314
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: hdfs-client
Affects Versions: trunk-win
Reporter: Chris Nauroth
Assignee: Chris Nauroth
 Fix For: trunk-win


The problem is visible by running {{TestDFSShell#testFilePermissions}}: the 
test fails when trying to set the sticky bit.  The cause is that 
branch-trunk-win accidentally merged in a branch-1 change in 
{{RawLocalFileSystem#setPermission}} to call {{FileUtil#setPermission}}, which 
sets permissions using the Java {{File}} API.  There is no way to set the 
sticky bit through this API.  We need to switch back to the trunk 
implementation of {{RawLocalFileSystem#setPermission}}, which uses either 
native code or a shell call to an external chmod.
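
As a hedged illustration of the API gap (hypothetical code, not the actual 
RawLocalFileSystem): the {{java.io.File}} permission setters only cover 
read/write/execute, so a mode like 1777 has to come from native code or an 
external chmod, as described above.

{code}
import java.io.File;
import java.io.IOException;

class StickyBitSketch {
  static void rwxOnly(File f) {
    // The java.io.File API stops here; there is no setter for the
    // sticky bit (the leading 1 in octal 1777).
    f.setReadable(true, false);
    f.setWritable(true, false);
    f.setExecutable(true, false);
  }

  static void chmodSticky(File dir) throws IOException, InterruptedException {
    // Hedged sketch of the shell fallback mentioned above.
    Process p = new ProcessBuilder("chmod", "1777", dir.getAbsolutePath())
        .inheritIO().start();
    if (p.waitFor() != 0) {
      throw new IOException("chmod failed for " + dir);
    }
  }
}
{code}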




[jira] [Updated] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance

2012-12-14 Thread Andy Isaacson (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andy Isaacson updated HDFS-4253:


Attachment: hdfs4253-2.txt

Avoid an extra {{a.equals(b)}} call by checking {{aIsLocal && bIsLocal}} 
instead.  On average, for a given sort, this results in fewer calls to 
{{.equals()}}.

> block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
> -
>
> Key: HDFS-4253
> URL: https://issues.apache.org/jira/browse/HDFS-4253
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253.txt
>
>
> When many nodes (10) read from the same block simultaneously, we get 
> asymmetric distribution of read load.  This can result in slow block reads 
> when one replica is serving most of the readers and the other replicas are 
> idle.  The busy DN bottlenecks on its network link.
> This is especially visible with large block sizes and high replica counts (I 
> reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication 
> 5), but the same behavior happens on a small scale with normal-sized blocks 
> and replication=3.
> The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which 
> explicitly does not try to spread traffic among replicas in a given rack -- 
> it only randomizes usage for off-rack replicas.



[jira] [Commented] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance

2012-12-14 Thread Andy Isaacson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532779#comment-13532779
 ] 

Andy Isaacson commented on HDFS-4253:
-

bq. The bug comes later, where you always return 1 if neither Node is on the 
local rack. This is wrong; it violates anticommutation (see link).

But that's not what the code does.  If neither Node is on the local rack, then 
{{aIsLocalRack == bIsLocalRack}} and we use the shuffle for a total ordering, 
right here:
{code}
858 if (aIsLocalRack == bIsLocalRack) {
859   int ai = shuffle.get(a), bi = shuffle.get(b);
860   if (ai < bi) {
861 return -1;
862   } else if (ai > bi) {
863 return 1;
864   } else {
865 return 0;
866   }
{code}
The final {{else}} is only reached when {{bIsLocalRack && !aIsLocalRack}}. So 
I'm pretty sure this implementation does satisfy anticommutation.

> block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
> -
>
> Key: HDFS-4253
> URL: https://issues.apache.org/jira/browse/HDFS-4253
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Andy Isaacson
> Attachments: hdfs4253-1.txt, hdfs4253.txt
>
>
> When many nodes (10) read from the same block simultaneously, we get 
> asymmetric distribution of read load.  This can result in slow block reads 
> when one replica is serving most of the readers and the other replicas are 
> idle.  The busy DN bottlenecks on its network link.
> This is especially visible with large block sizes and high replica counts (I 
> reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication 
> 5), but the same behavior happens on a small scale with normal-sized blocks 
> and replication=3.
> The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which 
> explicitly does not try to spread traffic among replicas in a given rack -- 
> it only randomizes usage for off-rack replicas.



[jira] [Commented] (HDFS-3429) DataNode reads checksums even if client does not need them

2012-12-14 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532724#comment-13532724
 ] 

Todd Lipcon commented on HDFS-3429:
---

Hi Liang. I'm not sure 0.94.2 has the right code to take advantage of this new 
feature quite yet -- given that you see a bunch of the .meta files being read, 
it seems like it doesn't.  That would explain why you don't see a performance 
difference.

> DataNode reads checksums even if client does not need them
> --
>
> Key: HDFS-3429
> URL: https://issues.apache.org/jira/browse/HDFS-3429
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-3429-0.20.2.patch, hdfs-3429-0.20.2.patch, 
> hdfs-3429.txt, hdfs-3429.txt, hdfs-3429.txt
>
>
> Currently, even if the client does not want to verify checksums, the datanode 
> reads them anyway and sends them over the wire. This means that performance 
> improvements like HBase's application-level checksums don't have much benefit 
> when reading through the datanode, since the DN is still causing seeks into 
> the checksum file.
> (Credit goes to Dhruba for discovering this - filing on his behalf)



[jira] [Commented] (HDFS-3465) 2NN doesn't start with fs.defaultFS set to a viewfs URI unless service RPC address is also set

2012-12-14 Thread Joseph Kniest (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532686#comment-13532686
 ] 

Joseph Kniest commented on HDFS-3465:
-

Hi, I am new to HDFS dev and I would like to take this issue as my first.  It 
may take a while, both because it's my first issue and because of my schedule, 
but I will do my best to be as prompt as possible.  Thanks!

> 2NN doesn't start with fs.defaultFS set to a viewfs URI unless service RPC 
> address is also set
> --
>
> Key: HDFS-3465
> URL: https://issues.apache.org/jira/browse/HDFS-3465
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: federation, namenode
>Affects Versions: 2.0.0-alpha
>Reporter: Eli Collins
>  Labels: newbie
>
> Looks like the 2NN first tries servicerpc-address then falls back on 
> fs.defaultFS, which won't work in the case of federation since fs.defaultFS 
> doesn't refer to an RPC address. Instead, the 2NN should first check 
> servicerpc-address, then rpc-address, then fall back on fs.defaultFS.
> {noformat}
> Exception in thread "main" java.lang.IllegalArgumentException: Invalid
> URI for NameNode address (check fs.defaultFS): viewfs:/// has no
> authority.
>at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:315)
>at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:303)
>at 
> org.apache.hadoop.hdfs.server.namenode.NameNode.getServiceAddress(NameNode.java:296)
>at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:214)
>at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.(SecondaryNameNode.java:178)
>at 
> org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:582)
> {noformat}
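
A hedged sketch of the proposed lookup order, with a plain Map standing in for 
Hadoop's Configuration (the config key names are real; the helper itself is 
illustrative):

{code}
import java.util.Map;

class ServiceAddressSketch {
  // Try servicerpc-address, then rpc-address, and only then fall back on
  // fs.defaultFS -- which is not an RPC address under federation/viewfs.
  static String resolveNnAddress(Map<String, String> conf) {
    String addr = conf.get("dfs.namenode.servicerpc-address");
    if (addr == null) {
      addr = conf.get("dfs.namenode.rpc-address");
    }
    if (addr == null) {
      addr = conf.get("fs.defaultFS");  // fails for viewfs:/// (no authority)
    }
    return addr;
  }
}
{code}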



[jira] [Created] (HDFS-4313) MiniDFSCluster throws NPE if umask is more permissive than 022

2012-12-14 Thread Luke Lu (JIRA)
Luke Lu created HDFS-4313:
-

 Summary: MiniDFSCluster throws NPE if umask is more permissive 
than 022
 Key: HDFS-4313
 URL: https://issues.apache.org/jira/browse/HDFS-4313
 Project: Hadoop HDFS
  Issue Type: Bug
  Components: test
Affects Versions: 1.1.1
Reporter: Luke Lu
Priority: Minor


MiniDFSCluster startup throws an NPE if the umask is more permissive than 022 
(e.g. 002).



[jira] [Commented] (HDFS-3912) Detecting and avoiding stale datanodes for writing

2012-12-14 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532495#comment-13532495
 ] 

Harsh J commented on HDFS-3912:
---

bq. Are you sure? It's committed in branch-1?

Yes, branch-1 has this as a backport commit; its separate branch-1 patch is 
attached here as well.

> Detecting and avoiding stale datanodes for writing
> --
>
> Key: HDFS-3912
> URL: https://issues.apache.org/jira/browse/HDFS-3912
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 1.2.0, 2.0.3-alpha
>
> Attachments: HDFS-3912.001.patch, HDFS-3912.002.patch, 
> HDFS-3912.003.patch, HDFS-3912.004.patch, HDFS-3912.005.patch, 
> HDFS-3912.006.patch, HDFS-3912.007.patch, HDFS-3912.008.patch, 
> HDFS-3912.009.patch, HDFS-3912-010.patch, HDFS-3912-branch-1.1-001.patch, 
> HDFS-3912-branch-1.patch, HDFS-3912.branch-1.patch
>
>
> 1. Make stale timeout adaptive to the number of nodes marked stale in the 
> cluster.
> 2. Consider having a separate configuration for write skipping the stale 
> nodes.



[jira] [Commented] (HDFS-3912) Detecting and avoiding stale datanodes for writing

2012-12-14 Thread Harsh J (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532493#comment-13532493
 ] 

Harsh J commented on HDFS-3912:
---

bq. FYI: This patch is missing the branch-2 patch. After applying HDFS-3703 for 
branch-2, it's missing the DFS_NAMENODE_CHECK_STALE_DATANODE_DEFAULT settings, 
etc..

The diff may depend on the JIRA you mention, but perhaps not the patch itself. 
We merged the trunk commit directly into branch-2; it is viewable at 
http://svn.apache.org/viewvc?view=revision&revision=1397219 and downloadable at 
http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java?revision=1397219&view=co

If you use git locally, you can also add a remote and cherry-pick it, I guess.

> Detecting and avoiding stale datanodes for writing
> --
>
> Key: HDFS-3912
> URL: https://issues.apache.org/jira/browse/HDFS-3912
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 1.2.0, 2.0.3-alpha
>
> Attachments: HDFS-3912.001.patch, HDFS-3912.002.patch, 
> HDFS-3912.003.patch, HDFS-3912.004.patch, HDFS-3912.005.patch, 
> HDFS-3912.006.patch, HDFS-3912.007.patch, HDFS-3912.008.patch, 
> HDFS-3912.009.patch, HDFS-3912-010.patch, HDFS-3912-branch-1.1-001.patch, 
> HDFS-3912-branch-1.patch, HDFS-3912.branch-1.patch
>
>
> 1. Make stale timeout adaptive to the number of nodes marked stale in the 
> cluster.
> 2. Consider having a separate configuration for write skipping the stale 
> nodes.



[jira] [Commented] (HDFS-3912) Detecting and avoiding stale datanodes for writing

2012-12-14 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532488#comment-13532488
 ] 

nkeywal commented on HDFS-3912:
---

Are you sure? It's committed in branch-1?

> Detecting and avoiding stale datanodes for writing
> --
>
> Key: HDFS-3912
> URL: https://issues.apache.org/jira/browse/HDFS-3912
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 1.2.0, 2.0.3-alpha
>
> Attachments: HDFS-3912.001.patch, HDFS-3912.002.patch, 
> HDFS-3912.003.patch, HDFS-3912.004.patch, HDFS-3912.005.patch, 
> HDFS-3912.006.patch, HDFS-3912.007.patch, HDFS-3912.008.patch, 
> HDFS-3912.009.patch, HDFS-3912-010.patch, HDFS-3912-branch-1.1-001.patch, 
> HDFS-3912-branch-1.patch, HDFS-3912.branch-1.patch
>
>
> 1. Make stale timeout adaptive to the number of nodes marked stale in the 
> cluster.
> 2. Consider having a separate configuration for write skipping the stale 
> nodes.



[jira] [Commented] (HDFS-3912) Detecting and avoiding stale datanodes for writing

2012-12-14 Thread Jeremy Carroll (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532482#comment-13532482
 ] 

Jeremy Carroll commented on HDFS-3912:
--

Basically this patch requires HDFS-3601 (version 3.0), so there is no branch-2 
patch on the ticket.

> Detecting and avoiding stale datanodes for writing
> --
>
> Key: HDFS-3912
> URL: https://issues.apache.org/jira/browse/HDFS-3912
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 1.2.0, 2.0.3-alpha
>
> Attachments: HDFS-3912.001.patch, HDFS-3912.002.patch, 
> HDFS-3912.003.patch, HDFS-3912.004.patch, HDFS-3912.005.patch, 
> HDFS-3912.006.patch, HDFS-3912.007.patch, HDFS-3912.008.patch, 
> HDFS-3912.009.patch, HDFS-3912-010.patch, HDFS-3912-branch-1.1-001.patch, 
> HDFS-3912-branch-1.patch, HDFS-3912.branch-1.patch
>
>
> 1. Make stale timeout adaptive to the number of nodes marked stale in the 
> cluster.
> 2. Consider having a separate configuration for write skipping the stale 
> nodes.



[jira] [Commented] (HDFS-3912) Detecting and avoiding stale datanodes for writing

2012-12-14 Thread Jeremy Carroll (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532457#comment-13532457
 ] 

Jeremy Carroll commented on HDFS-3912:
--

FYI: this JIRA is missing a branch-2 patch. After applying HDFS-3703 to 
branch-2, it's missing the DFS_NAMENODE_CHECK_STALE_DATANODE_DEFAULT settings, 
etc.

> Detecting and avoiding stale datanodes for writing
> --
>
> Key: HDFS-3912
> URL: https://issues.apache.org/jira/browse/HDFS-3912
> Project: Hadoop HDFS
>  Issue Type: Sub-task
>Affects Versions: 3.0.0
>Reporter: Jing Zhao
>Assignee: Jing Zhao
> Fix For: 1.2.0, 2.0.3-alpha
>
> Attachments: HDFS-3912.001.patch, HDFS-3912.002.patch, 
> HDFS-3912.003.patch, HDFS-3912.004.patch, HDFS-3912.005.patch, 
> HDFS-3912.006.patch, HDFS-3912.007.patch, HDFS-3912.008.patch, 
> HDFS-3912.009.patch, HDFS-3912-010.patch, HDFS-3912-branch-1.1-001.patch, 
> HDFS-3912-branch-1.patch, HDFS-3912.branch-1.patch
>
>
> 1. Make stale timeout adaptive to the number of nodes marked stale in the 
> cluster.
> 2. Consider having a separate configuration for write skipping the stale 
> nodes.



[jira] [Commented] (HDFS-4312) fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc

2012-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532388#comment-13532388
 ] 

Hadoop QA commented on HDFS-4312:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12560969/HDFS-4312.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3666//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3666//console

This message is automatically generated.

> fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc
> --
>
> Key: HDFS-4312
> URL: https://issues.apache.org/jira/browse/HDFS-4312
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan A. Veselovsky
>Assignee: Ivan A. Veselovsky
> Attachments: HDFS-4312.patch
>
>
> TestSecureNameNode does not work on Java6 without the 
> "dfs.web.authentication.kerberos.principal" config property set.
> The following was also improved:
> 1) keytab files are checked for existence and readability, to fail fast on 
> config errors.
> 2) a comment was added to TestSecureNameNode describing the required sys props.
> 3) string literals were replaced with config constants.



[jira] [Commented] (HDFS-4310) fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode

2012-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532316#comment-13532316
 ] 

Hudson commented on HDFS-4310:
--

Integrated in Hadoop-Mapreduce-trunk #1285 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1285/])
HDFS-4310. fix test 
org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode Contributed by 
Ivan A. Veselovsky. (Revision 1421560)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1421560
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode.java


> fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode
> ---
>
> Key: HDFS-4310
> URL: https://issues.apache.org/jira/browse/HDFS-4310
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan A. Veselovsky
>Assignee: Ivan A. Veselovsky
> Fix For: 3.0.0
>
> Attachments: HDFS-4310.patch
>
>
> the test org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode 
> catches exceptions and does not re-throw them. Due to that, it passes even 
> when it actually fails.
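
For illustration, the anti-pattern being described looks roughly like this 
(hypothetical code, not the actual test):

{code}
import org.junit.Test;

public class SwallowedFailureSketch {
  @Test
  public void testSomething() {
    try {
      riskyOperation();
    } catch (Exception e) {
      e.printStackTrace();  // BUG: swallowed, so the test still passes.
      // Fix: rethrow, or call org.junit.Assert.fail(e.toString()).
    }
  }

  private void riskyOperation() throws Exception {
    // stand-in for the secure-DataNode startup being exercised
  }
}
{code}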



[jira] [Commented] (HDFS-4307) SocketCache should use monotonic time

2012-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532313#comment-13532313
 ] 

Hudson commented on HDFS-4307:
--

Integrated in Hadoop-Mapreduce-trunk #1285 (See 
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1285/])
HDFS-4307. SocketCache should use monotonic time. Contributed by Colin 
Patrick McCabe. (Revision 1421572)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1421572
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/SocketCache.java


> SocketCache should use monotonic time
> -
>
> Key: HDFS-4307
> URL: https://issues.apache.org/jira/browse/HDFS-4307
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.0.3-alpha
>
> Attachments: HDFS-4307.001.patch, HDFS-4307.002.patch
>
>
> {{SocketCache}} should use monotonic time, not wall-clock time.  Otherwise, 
> if the time is adjusted by ntpd or a system administrator, sockets could be 
> either abruptly expired or left in the cache indefinitely.
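
A minimal sketch of the difference (illustrative only; the actual change is 
in SocketCache.java):

{code}
import java.util.concurrent.TimeUnit;

// Illustrative sketch: expiry based on monotonic time is immune to
// wall-clock adjustments made by ntpd or an administrator.
class ExpirySketch {
  // System.currentTimeMillis() jumps when the system clock is changed;
  // System.nanoTime() is monotonic and only ever moves forward.
  private final long createdNanos = System.nanoTime();

  boolean isExpired(long expiryMs) {
    long ageMs =
        TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - createdNanos);
    return ageMs > expiryMs;
  }
}
{code}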

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4312) fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc

2012-12-14 Thread Ivan A. Veselovsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan A. Veselovsky updated HDFS-4312:
-

Affects Version/s: 3.0.0
   Status: Patch Available  (was: Open)

> fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc
> --
>
> Key: HDFS-4312
> URL: https://issues.apache.org/jira/browse/HDFS-4312
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan A. Veselovsky
>Assignee: Ivan A. Veselovsky
> Attachments: HDFS-4312.patch
>
>
> TestSecureNameNode does not work on Java 6 unless the 
> "dfs.web.authentication.kerberos.principal" config property is set.
> The following improvements are also made:
> 1) keytab files are checked for existence and readability, to fail fast on 
> config errors.
> 2) a comment is added to TestSecureNameNode describing the required system 
> properties.
> 3) string literals are replaced with config constants.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4312) fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc

2012-12-14 Thread Ivan A. Veselovsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan A. Veselovsky updated HDFS-4312:
-

Attachment: HDFS-4312.patch

> fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc
> --
>
> Key: HDFS-4312
> URL: https://issues.apache.org/jira/browse/HDFS-4312
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ivan A. Veselovsky
>Assignee: Ivan A. Veselovsky
> Attachments: HDFS-4312.patch
>
>
> TestSecureNameNode does not work on Java 6 unless the 
> "dfs.web.authentication.kerberos.principal" config property is set.
> The following improvements are also made:
> 1) keytab files are checked for existence and readability, to fail fast on 
> config errors.
> 2) a comment is added to TestSecureNameNode describing the required system 
> properties.
> 3) string literals are replaced with config constants.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4312) fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc

2012-12-14 Thread Ivan A. Veselovsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan A. Veselovsky updated HDFS-4312:
-

Summary: fix test TestSecureNameNode and improve test 
TestSecureNameNodeWithExternalKdc  (was: fix test TestSecureNameNode and 
improve test TestSecureNameNode)

> fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc
> --
>
> Key: HDFS-4312
> URL: https://issues.apache.org/jira/browse/HDFS-4312
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ivan A. Veselovsky
>Assignee: Ivan A. Veselovsky
>
> TestSecureNameNode does not work on Java 6 unless the 
> "dfs.web.authentication.kerberos.principal" config property is set.
> The following improvements are also made:
> 1) keytab files are checked for existence and readability, to fail fast on 
> config errors.
> 2) a comment is added to TestSecureNameNode describing the required system 
> properties.
> 3) string literals are replaced with config constants.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HDFS-4312) fix test TestSecureNameNode and improve test TestSecureNameNode

2012-12-14 Thread Ivan A. Veselovsky (JIRA)
Ivan A. Veselovsky created HDFS-4312:


 Summary: fix test TestSecureNameNode and improve test 
TestSecureNameNode
 Key: HDFS-4312
 URL: https://issues.apache.org/jira/browse/HDFS-4312
 Project: Hadoop HDFS
  Issue Type: Bug
Reporter: Ivan A. Veselovsky
Assignee: Ivan A. Veselovsky


TestSecureNameNode does not work on Java 6 unless the 
"dfs.web.authentication.kerberos.principal" config property is set.

The following improvements are also made:
1) keytab files are checked for existence and readability, to fail fast on 
config errors.
2) a comment is added to TestSecureNameNode describing the required system 
properties.
3) string literals are replaced with config constants.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4309) Multithreaded get through the Cache FileSystem Object to lead LeaseChecker memory leak

2012-12-14 Thread ChenFolin (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532291#comment-13532291
 ] 

ChenFolin commented on HDFS-4309:
-

Hi Aaron T. Myers,
When I execute "dev-support/test-patch.sh patch", it causes many errors, such 
as:
"org.apache.hadoop.record.RecordComparator is deprecated."
and the code is:
{code}
@Deprecated
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class RecordComparator extends WritableComparator {
{code}

So,"dev-support/test-patch.sh patch" exec failed.And now,how can I do for it?

======================================================================
======================================================================
Determining number of patched javac warnings.
======================================================================
======================================================================


mvn clean test -DskipTests -DHadoopPatchProcess -Pnative -Ptest-patch > 
/tmp/patchJavacWarnings.txt 2>&1




{color:red}-1 overall{color}.  

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:red}-1 javac{color:red}.  The patch appears to cause the build to 
fail.




======================================================================
======================================================================
Finished build.
======================================================================
======================================================================

> Multithreaded get through the Cache FileSystem Object to lead LeaseChecker 
> memory leak
> --
>
> Key: HDFS-4309
> URL: https://issues.apache.org/jira/browse/HDFS-4309
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: hdfs-client
>Affects Versions: 0.20.205.0, 0.23.1, 0.23.4, 2.0.1-alpha, 2.0.2-alpha
>Reporter: MaWenJin
>  Labels: patch
> Attachments: jmap2.log
>
>   Original Estimate: 204h
>  Remaining Estimate: 204h
>
> If multiple threads concurrently execute the following method, each of them 
> may call fs = createFileSystem(uri, conf). This creates multiple DFSClient 
> instances, each starting its own LeaseChecker daemon thread; the shutdown 
> hook may not be able to close all of them when the process exits, resulting 
> in a memory leak.
> {code}
> private FileSystem getInternal(URI uri, Configuration conf, Key key) throws 
> IOException{
>   FileSystem fs = null;
>   synchronized (this) {
> fs = map.get(key);
>   }
>   if (fs != null) {
> return fs;
>   }
>   //  this is 
>   fs = createFileSystem(uri, conf);
>   synchronized (this) {  // refetch the lock again
> FileSystem oldfs = map.get(key);
> if (oldfs != null) { // a file system is created while lock is 
> releasing
>   fs.close(); // close the new file system
>   return oldfs;  // return the old file system
> }
> // now insert the new file system into the map
> if (map.isEmpty() && !clientFinalizer.isAlive()) {
>   Runtime.getRuntime().addShutdownHook(clientFinalizer);
> }
> fs.key = key;
> map.put(key, fs);
> if (conf.getBoolean("fs.automatic.close", true)) {
>   toAutoClose.add(key);
> }
> return fs;
>   }
> }
> {code}
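
A hypothetical sketch of how the race could be triggered (not from the patch; 
the URI is a placeholder): several threads miss the cache at the same time, so 
each one calls createFileSystem() and starts its own LeaseChecker daemon 
thread.

{code}
import java.net.URI;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;

public class FsCacheRaceSketch {
  public static void main(String[] args) throws Exception {
    final Configuration conf = new Configuration();
    final URI uri = URI.create("hdfs://localhost:8020/");  // placeholder
    // All threads race through the unsynchronized createFileSystem() path.
    for (int i = 0; i < 10; i++) {
      new Thread(new Runnable() {
        public void run() {
          try {
            FileSystem.get(uri, conf);  // concurrent cache misses
          } catch (Exception e) {
            e.printStackTrace();
          }
        }
      }).start();
    }
  }
}
{code}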

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4310) fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode

2012-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532289#comment-13532289
 ] 

Hudson commented on HDFS-4310:
--

Integrated in Hadoop-Hdfs-trunk #1254 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1254/])
HDFS-4310. fix test 
org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode Contributed by 
Ivan A. Veselovsky. (Revision 1421560)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1421560
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode.java


> fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode
> ---
>
> Key: HDFS-4310
> URL: https://issues.apache.org/jira/browse/HDFS-4310
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan A. Veselovsky
>Assignee: Ivan A. Veselovsky
> Fix For: 3.0.0
>
> Attachments: HDFS-4310.patch
>
>
> The test org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode 
> catches exceptions and does not re-throw them. As a result, it passes even 
> when it actually fails.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4307) SocketCache should use monotonic time

2012-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532286#comment-13532286
 ] 

Hudson commented on HDFS-4307:
--

Integrated in Hadoop-Hdfs-trunk #1254 (See 
[https://builds.apache.org/job/Hadoop-Hdfs-trunk/1254/])
HDFS-4307. SocketCache should use monotonic time. Contributed by Colin 
Patrick McCabe. (Revision 1421572)

 Result = FAILURE
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1421572
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/SocketCache.java


> SocketCache should use monotonic time
> -
>
> Key: HDFS-4307
> URL: https://issues.apache.org/jira/browse/HDFS-4307
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.0.3-alpha
>
> Attachments: HDFS-4307.001.patch, HDFS-4307.002.patch
>
>
> {{SocketCache}} should use monotonic time, not wall-clock time.  Otherwise, 
> if the time is adjusted by ntpd or a system administrator, sockets could be 
> either abruptly expired or left in the cache indefinitely.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4307) SocketCache should use monotonic time

2012-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532229#comment-13532229
 ] 

Hudson commented on HDFS-4307:
--

Integrated in Hadoop-Yarn-trunk #65 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/65/])
HDFS-4307. SocketCache should use monotonic time. Contributed by Colin 
Patrick McCabe. (Revision 1421572)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1421572
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/SocketCache.java


> SocketCache should use monotonic time
> -
>
> Key: HDFS-4307
> URL: https://issues.apache.org/jira/browse/HDFS-4307
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.0.3-alpha
>Reporter: Colin Patrick McCabe
>Assignee: Colin Patrick McCabe
>Priority: Minor
> Fix For: 2.0.3-alpha
>
> Attachments: HDFS-4307.001.patch, HDFS-4307.002.patch
>
>
> {{SocketCache}} should use monotonic time, not wall-clock time.  Otherwise, 
> if the time is adjusted by ntpd or a system administrator, sockets could be 
> either abruptly expired or left in the cache indefinitely.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4310) fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode

2012-12-14 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532232#comment-13532232
 ] 

Hudson commented on HDFS-4310:
--

Integrated in Hadoop-Yarn-trunk #65 (See 
[https://builds.apache.org/job/Hadoop-Yarn-trunk/65/])
HDFS-4310. fix test 
org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode Contributed by 
Ivan A. Veselovsky. (Revision 1421560)

 Result = SUCCESS
atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1421560
Files : 
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* 
/hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode.java


> fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode
> ---
>
> Key: HDFS-4310
> URL: https://issues.apache.org/jira/browse/HDFS-4310
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0
>Reporter: Ivan A. Veselovsky
>Assignee: Ivan A. Veselovsky
> Fix For: 3.0.0
>
> Attachments: HDFS-4310.patch
>
>
> The test org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode 
> catches exceptions and does not re-throw them. As a result, it passes even 
> when it actually fails.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4311) repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos

2012-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532195#comment-13532195
 ] 

Hadoop QA commented on HDFS-4311:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12560935/HDFS-4311.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 2 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs-httpfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3665//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3665//console

This message is automatically generated.

> repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos
> ---
>
> Key: HDFS-4311
> URL: https://issues.apache.org/jira/browse/HDFS-4311
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Ivan A. Veselovsky
>Assignee: Ivan A. Veselovsky
> Attachments: HDFS-4311.patch
>
>
> Some of the test cases in this test class fail because they are affected by 
> static state changed by previous test cases, namely the static field 
> org.apache.hadoop.security.UserGroupInformation.loginUser.
> The suggested patch solves this problem.
> Besides, the following improvements are made:
> 1) the user principal and keytab values are parametrized via system 
> properties (see the sketch below);
> 2) the Jetty server and the minicluster are shut down between the test cases 
> to make the test methods independent of each other.
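
A sketch of the parametrization (property names here are hypothetical and may 
differ from the ones used by the patch):

{code}
import org.apache.hadoop.security.UserGroupInformation;
import org.junit.Before;
import static org.junit.Assert.assertNotNull;

// Hypothetical sketch: read the principal and keytab from system properties
// and log in fresh before each test, so no test method depends on the
// loginUser state left behind by a previous one.
public class SecureTestBaseSketch {
  @Before
  public void loginFromKeytab() throws Exception {
    String principal = System.getProperty("test.kerberos.principal");
    String keytab = System.getProperty("test.kerberos.keytab");
    assertNotNull("test.kerberos.principal must be set", principal);
    assertNotNull("test.kerberos.keytab must be set", keytab);
    UserGroupInformation.loginUserFromKeytab(principal, keytab);
  }
}
{code}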

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4311) repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos

2012-12-14 Thread Ivan A. Veselovsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan A. Veselovsky updated HDFS-4311:
-

Affects Version/s: 2.0.3-alpha
   3.0.0
   Status: Patch Available  (was: Open)

> repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos
> ---
>
> Key: HDFS-4311
> URL: https://issues.apache.org/jira/browse/HDFS-4311
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 3.0.0, 2.0.3-alpha
>Reporter: Ivan A. Veselovsky
>Assignee: Ivan A. Veselovsky
> Attachments: HDFS-4311.patch
>
>
> Some of the test cases in this test class fail because they are affected by 
> static state changed by previous test cases, namely the static field 
> org.apache.hadoop.security.UserGroupInformation.loginUser.
> The suggested patch solves this problem.
> Besides, the following improvements are made:
> 1) the user principal and keytab values are parametrized via system 
> properties;
> 2) the Jetty server and the minicluster are shut down between the test cases 
> to make the test methods independent of each other.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HDFS-4311) repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos

2012-12-14 Thread Ivan A. Veselovsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan A. Veselovsky updated HDFS-4311:
-

Attachment: HDFS-4311.patch

> repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos
> ---
>
> Key: HDFS-4311
> URL: https://issues.apache.org/jira/browse/HDFS-4311
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ivan A. Veselovsky
>Assignee: Ivan A. Veselovsky
> Attachments: HDFS-4311.patch
>
>
> Some of the test cases in this test class fail because they are affected by 
> static state changed by previous test cases, namely the static field 
> org.apache.hadoop.security.UserGroupInformation.loginUser.
> The suggested patch solves this problem.
> Besides, the following improvements are made:
> 1) the user principal and keytab values are parametrized via system 
> properties;
> 2) the Jetty server and the minicluster are shut down between the test cases 
> to make the test methods independent of each other.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Moved] (HDFS-4311) repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos

2012-12-14 Thread Ivan A. Veselovsky (JIRA)

 [ 
https://issues.apache.org/jira/browse/HDFS-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ivan A. Veselovsky moved HADOOP-9143 to HDFS-4311:
--

Key: HDFS-4311  (was: HADOOP-9143)
Project: Hadoop HDFS  (was: Hadoop Common)

> repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos
> ---
>
> Key: HDFS-4311
> URL: https://issues.apache.org/jira/browse/HDFS-4311
> Project: Hadoop HDFS
>  Issue Type: Bug
>Reporter: Ivan A. Veselovsky
>Assignee: Ivan A. Veselovsky
>
> Some of the test cases in this test class fail because they are affected by 
> static state changed by previous test cases, namely the static field 
> org.apache.hadoop.security.UserGroupInformation.loginUser.
> The suggested patch solves this problem.
> Besides, the following improvements are made:
> 1) the user principal and keytab values are parametrized via system 
> properties;
> 2) the Jetty server and the minicluster are shut down between the test cases 
> to make the test methods independent of each other.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-3429) DataNode reads checksums even if client does not need them

2012-12-14 Thread liang xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532161#comment-13532161
 ] 

liang xie commented on HDFS-3429:
-

Still no obvious difference was found in another 100% read scenario that was 
not IO-bound.

I ran "strace -p  -f -tt -T -e trace=file -o bbb" during a run of several 
minutes (without the patch), then:
{noformat}
grep "current/finalized" bbb|wc -l
16905
grep meta bbb|wc -l
9858
grep meta bbb|grep open|wc -l
3286
grep meta bbb|grep stat|wc -l
6572
grep meta bbb|grep "\".*\"" -o|sort -n |uniq -c|wc -l
303
{noformat}
Most of those meta files are several hundred kilobytes in size; furthermore, 
our OS has a default read_ahead_kb of 128, so it seems plausible that the 
benefit was not obvious. Any ideas, [~tlipcon]?

But I am +1 on this patch, since it can reduce some unnecessary IO and system 
calls.

> DataNode reads checksums even if client does not need them
> --
>
> Key: HDFS-3429
> URL: https://issues.apache.org/jira/browse/HDFS-3429
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: datanode, performance
>Affects Versions: 2.0.0-alpha
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Attachments: hdfs-3429-0.20.2.patch, hdfs-3429-0.20.2.patch, 
> hdfs-3429.txt, hdfs-3429.txt, hdfs-3429.txt
>
>
> Currently, even if the client does not want to verify checksums, the datanode 
> reads them anyway and sends them over the wire. This means that performance 
> improvements like HBase's application-level checksums don't have much benefit 
> when reading through the datanode, since the DN is still causing seeks into 
> the checksum file.
> (Credit goes to Dhruba for discovering this - filing on his behalf)
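
For context, an HDFS client can already opt out of checksum verification via 
FileSystem.setVerifyChecksum(); what this patch addresses is the datanode 
still reading the checksum file in that case. A usage sketch (the path is a 
placeholder):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class NoChecksumReadSketch {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    fs.setVerifyChecksum(false);  // skip client-side checksum verification
    FSDataInputStream in = fs.open(new Path("/tmp/a/t1.txt"));  // placeholder
    byte[] buf = new byte[4096];
    int n = in.read(buf);
    System.out.println("read " + n + " bytes without checksum verification");
    in.close();
  }
}
{code}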

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HDFS-4140) fuse-dfs handles open(O_TRUNC) poorly

2012-12-14 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532152#comment-13532152
 ] 

Hadoop QA commented on HDFS-4140:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12560910/HDFS-4140.008.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.TestPersistBlocks
  
org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/3664//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3664//console

This message is automatically generated.

> fuse-dfs handles open(O_TRUNC) poorly
> -
>
> Key: HDFS-4140
> URL: https://issues.apache.org/jira/browse/HDFS-4140
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: fuse-dfs
>Affects Versions: 2.0.2-alpha
>Reporter: Andy Isaacson
>Assignee: Colin Patrick McCabe
> Attachments: HDFS-4140.003.patch, HDFS-4140.004.patch, 
> HDFS-4140.005.patch, HDFS-4140.006.patch, HDFS-4140.007.patch, 
> HDFS-4140.008.patch
>
>
> fuse-dfs handles open(O_TRUNC) poorly.
> The open is converted to multiple FUSE operations, and those operations 
> often fail (for example, calling fuse_truncate_impl() while the file is also 
> open for write results in a "multiple writers!" exception).
> One easy way to see the problem is to run the following sequence of shell 
> commands:
> {noformat}
> ubuntu@ubu-cdh-0:~$ echo foo > /export/hdfs/tmp/a/t1.txt
> ubuntu@ubu-cdh-0:~$ ls -l /export/hdfs/tmp/a
> total 0
> -rw-r--r-- 1 ubuntu hadoop 4 Nov  1 15:21 t1.txt
> ubuntu@ubu-cdh-0:~$ hdfs dfs -ls /tmp/a
> Found 1 items
> -rw-r--r--   3 ubuntu hadoop  4 2012-11-01 15:21 /tmp/a/t1.txt
> ubuntu@ubu-cdh-0:~$ echo bar > /export/hdfs/tmp/a/t1.txt
> ubuntu@ubu-cdh-0:~$ ls -l /export/hdfs/tmp/a
> total 0
> -rw-r--r-- 1 ubuntu hadoop 0 Nov  1 15:22 t1.txt
> ubuntu@ubu-cdh-0:~$ hdfs dfs -ls /tmp/a
> Found 1 items
> -rw-r--r--   3 ubuntu hadoop  0 2012-11-01 15:22 /tmp/a/t1.txt
> {noformat}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira