[jira] [Updated] (HDFS-4227) Document dfs.namenode.resource.*
[ https://issues.apache.org/jira/browse/HDFS-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daisuke Kobayashi updated HDFS-4227: Attachment: HDFS-4227.patch New patch attached. Can you review it? > Document dfs.namenode.resource.* > -- > > Key: HDFS-4227 > URL: https://issues.apache.org/jira/browse/HDFS-4227 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Eli Collins >Assignee: Daisuke Kobayashi > Labels: newbie > Attachments: HDFS-4227.patch > > > Let's document {{dfs.namenode.resource.*}} in hdfs-default.xml and a section > in the HDFS docs that covers local directories. > {{dfs.namenode.resource.check.interval}} - the interval in ms at which the > NameNode resource checker runs (default is 5000). > {{dfs.namenode.resource.du.reserved}} - the amount of space to > reserve/require for a NN storage directory (default is 100 MB). > {{dfs.namenode.resource.checked.volumes}} - a list of local directories for > the NN resource checker to check in addition to the local edits directories > (default is empty). > {{dfs.namenode.resource.checked.volumes.minimum}} - the minimum number of > redundant NN storage volumes required (default is 1). If no redundant > resources are available, the NN does not enter safe mode as long as sufficient required > resources are available. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
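[Editor's note] For anyone wiring these properties up, the following is a minimal, illustrative Java sketch, not the actual NameNodeResourceChecker code: it reads the four keys through the standard Configuration API. The key strings and defaults come from the issue text above; the class name is hypothetical.
{code}
import java.util.Collection;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class ResourceCheckerConfigSketch {
  public static void main(String[] args) {
    Configuration conf = new HdfsConfiguration();
    // Interval in ms at which the NameNode resource checker runs.
    long checkIntervalMs = conf.getLong("dfs.namenode.resource.check.interval", 5000);
    // Space to reserve/require for a NN storage directory (100 MB default).
    long duReservedBytes = conf.getLong("dfs.namenode.resource.du.reserved", 100L * 1024 * 1024);
    // Extra local directories to check, in addition to the local edits directories.
    Collection<String> checkedVolumes =
        conf.getTrimmedStringCollection("dfs.namenode.resource.checked.volumes");
    // Minimum number of redundant NN storage volumes required.
    int minRedundantVolumes = conf.getInt("dfs.namenode.resource.checked.volumes.minimum", 1);
    System.out.printf("interval=%dms reserved=%dB extraVolumes=%s minRedundant=%d%n",
        checkIntervalMs, duReservedBytes, checkedVolumes, minRedundantVolumes);
  }
}
{code}
Documenting exactly these key strings and defaults in hdfs-default.xml is the point of the issue; the sketch only shows how they surface through Configuration.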
[jira] [Assigned] (HDFS-4227) Document dfs.namenode.resource.*
[ https://issues.apache.org/jira/browse/HDFS-4227?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daisuke Kobayashi reassigned HDFS-4227: --- Assignee: Daisuke Kobayashi > Document dfs.namenode.resource.* > -- > > Key: HDFS-4227 > URL: https://issues.apache.org/jira/browse/HDFS-4227 > Project: Hadoop HDFS > Issue Type: Bug > Components: documentation >Affects Versions: 1.0.0, 2.0.0-alpha >Reporter: Eli Collins >Assignee: Daisuke Kobayashi > Labels: newbie > > Let's document {{dfs.namenode.resource.*}} in hdfs-default.xml and a section > in the HDFS docs that covers local directories. > {{dfs.namenode.resource.check.interval}} - the interval in ms at which the > NameNode resource checker runs (default is 5000). > {{dfs.namenode.resource.du.reserved}} - the amount of space to > reserve/require for a NN storage directory (default is 100 MB). > {{dfs.namenode.resource.checked.volumes}} - a list of local directories for > the NN resource checker to check in addition to the local edits directories > (default is empty). > {{dfs.namenode.resource.checked.volumes.minimum}} - the minimum number of > redundant NN storage volumes required (default is 1). If no redundant > resources are available, the NN does not enter safe mode as long as sufficient required > resources are available. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-347) DFS read performance suboptimal when client co-located on nodes with data
[ https://issues.apache.org/jira/browse/HDFS-347?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532938#comment-13532938 ] Hadoop QA commented on HDFS-347: {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12561092/HDFS-347.027.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 14 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 2013 javac compiler warnings (more than the trunk's current 2012 warnings). {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3669//testReport/ Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/3669//artifact/trunk/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3669//console This message is automatically generated. > DFS read performance suboptimal when client co-located on nodes with data > - > > Key: HDFS-347 > URL: https://issues.apache.org/jira/browse/HDFS-347 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, hdfs-client, performance >Reporter: George Porter >Assignee: Colin Patrick McCabe > Attachments: all.tsv, BlockReaderLocal1.txt, HADOOP-4801.1.patch, > HADOOP-4801.2.patch, HADOOP-4801.3.patch, HDFS-347-016_cleaned.patch, > HDFS-347.016.patch, HDFS-347.017.clean.patch, HDFS-347.017.patch, > HDFS-347.018.clean.patch, HDFS-347.018.patch2, HDFS-347.019.patch, > HDFS-347.020.patch, HDFS-347.021.patch, HDFS-347.022.patch, > HDFS-347.024.patch, HDFS-347.025.patch, HDFS-347.026.patch, > HDFS-347.027.patch, HDFS-347-branch-20-append.txt, hdfs-347.png, > hdfs-347.txt, local-reads-doc > > > One of the major strategies Hadoop uses to get scalable data processing is to > move the code to the data. However, putting the DFS client on the same > physical node as the data blocks it acts on doesn't improve read performance > as much as expected. > After looking at Hadoop and O/S traces (via HADOOP-4049), I think the problem > is due to the HDFS streaming protocol causing many more read I/O operations > (iops) than necessary. Consider the case of a DFSClient fetching a 64 MB > disk block from the DataNode process (running in a separate JVM) running on > the same machine. The DataNode will satisfy the single disk block request by > sending data back to the HDFS client in 64-KB chunks. In BlockSender.java, > this is done in the sendChunk() method, relying on Java's transferTo() > method. Depending on the host O/S and JVM implementation, transferTo() is > implemented as either a sendfilev() syscall or a pair of mmap() and write(). 
> In either case, each chunk is read from the disk by issuing a separate I/O > operation for each chunk. The result is that the single request for a 64-MB > block ends up hitting the disk as over a thousand smaller requests for 64-KB > each. > Since the DFSClient runs in a different JVM and process than the DataNode, > shuttling data from the disk to the DFSClient also results in context > switches each time network packets get sent (in this case, the 64-kb chunk > turns into a large number of 1500 byte packet send operations). Thus we see > a large number of context switches for each block send operation. > I'd like to get some feedback on the best way to address this, but I think > the answer is providing a mechanism for a DFSClient to directly open data blocks that > happen to be on the same machine. It could do this by examining the set of > LocatedBlocks returned by the NameNode, marking those that should be resident > on the local host. Since the DataNode and DFSClient (probably) share the > same hadoop configuration, the DFSClient should be able to find the files > holding the block data, and it could directly open them and send data back to > the client. This would avoid the context switches imposed by the network > layer, and would allow for much larger read buffers than 64KB, which should > reduce the number of iops imposed by each read block operation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
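[Editor's note] To make the proposal concrete, here is a minimal sketch of the local-read idea under discussion: when a replica is local, bypass the DataNode's chunked streaming path and read the block file directly with a large buffer. Everything here is illustrative; the class and method are hypothetical, and the sketch assumes the client has already resolved the block file's path (in the proposal that would come from LocatedBlocks plus the shared configuration).
{code}
import java.io.FileInputStream;
import java.io.IOException;

public class LocalBlockReadSketch {
  // Much larger than the 64 KB streaming chunk size discussed above.
  static final int LARGE_BUF = 4 * 1024 * 1024;

  // Read an entire local block file with a single stream and a large buffer,
  // avoiding the per-64KB iops and per-packet context switches of the
  // DataNode streaming path.
  static long readLocalBlock(String blockFilePath) throws IOException {
    byte[] buf = new byte[LARGE_BUF];
    long total = 0;
    try (FileInputStream in = new FileInputStream(blockFilePath)) {
      int n;
      while ((n = in.read(buf)) > 0) {
        total += n; // a real client would hand these bytes to the application
      }
    }
    return total;
  }
}
{code}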
[jira] [Commented] (HDFS-4315) DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access
[ https://issues.apache.org/jira/browse/HDFS-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532928#comment-13532928 ] Hadoop QA commented on HDFS-4315: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12561073/HDFS-4315.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3668//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3668//console This message is automatically generated. > DNs with multiple BPs can have BPOfferServices fail to start due to > unsynchronized map access > - > > Key: HDFS-4315 > URL: https://issues.apache.org/jira/browse/HDFS-4315 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.0.2-alpha >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-4315.patch > > > In some nightly test runs we've seen pretty frequent failures of > TestWebHdfsWithMultipleNameNodes. I've traced the root cause to an > unsynchronized map access in the DataStorage class. > More details in the first comment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-347) DFS read performance suboptimal when client co-located on nodes with data
[ https://issues.apache.org/jira/browse/HDFS-347?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Colin Patrick McCabe updated HDFS-347: -- Attachment: HDFS-347.027.patch This doesn't address all the points in the reviewboard (still working on another rev which does). However it does have the path security validation, the addition of {{dfs.client.domain.socket.data.traffic}}, some refactoring of BlockReaderFactory and the addition of DomainSocketFactory, and renaming of {{getBindPath}} to {{getBoundPath}}. > DFS read performance suboptimal when client co-located on nodes with data > - > > Key: HDFS-347 > URL: https://issues.apache.org/jira/browse/HDFS-347 > Project: Hadoop HDFS > Issue Type: Improvement > Components: datanode, hdfs-client, performance >Reporter: George Porter >Assignee: Colin Patrick McCabe > Attachments: all.tsv, BlockReaderLocal1.txt, HADOOP-4801.1.patch, > HADOOP-4801.2.patch, HADOOP-4801.3.patch, HDFS-347-016_cleaned.patch, > HDFS-347.016.patch, HDFS-347.017.clean.patch, HDFS-347.017.patch, > HDFS-347.018.clean.patch, HDFS-347.018.patch2, HDFS-347.019.patch, > HDFS-347.020.patch, HDFS-347.021.patch, HDFS-347.022.patch, > HDFS-347.024.patch, HDFS-347.025.patch, HDFS-347.026.patch, > HDFS-347.027.patch, HDFS-347-branch-20-append.txt, hdfs-347.png, > hdfs-347.txt, local-reads-doc > > > One of the major strategies Hadoop uses to get scalable data processing is to > move the code to the data. However, putting the DFS client on the same > physical node as the data blocks it acts on doesn't improve read performance > as much as expected. > After looking at Hadoop and O/S traces (via HADOOP-4049), I think the problem > is due to the HDFS streaming protocol causing many more read I/O operations > (iops) than necessary. Consider the case of a DFSClient fetching a 64 MB > disk block from the DataNode process (running in a separate JVM) running on > the same machine. The DataNode will satisfy the single disk block request by > sending data back to the HDFS client in 64-KB chunks. In BlockSender.java, > this is done in the sendChunk() method, relying on Java's transferTo() > method. Depending on the host O/S and JVM implementation, transferTo() is > implemented as either a sendfilev() syscall or a pair of mmap() and write(). > In either case, each chunk is read from the disk by issuing a separate I/O > operation for each chunk. The result is that the single request for a 64-MB > block ends up hitting the disk as over a thousand smaller requests for 64-KB > each. > Since the DFSClient runs in a different JVM and process than the DataNode, > shuttling data from the disk to the DFSClient also results in context > switches each time network packets get sent (in this case, the 64-kb chunk > turns into a large number of 1500 byte packet send operations). Thus we see > a large number of context switches for each block send operation. > I'd like to get some feedback on the best way to address this, but I think > the answer is providing a mechanism for a DFSClient to directly open data blocks that > happen to be on the same machine. It could do this by examining the set of > LocatedBlocks returned by the NameNode, marking those that should be resident > on the local host. Since the DataNode and DFSClient (probably) share the > same hadoop configuration, the DFSClient should be able to find the files > holding the block data, and it could directly open them and send data back to > the client.
This would avoid the context switches imposed by the network > layer, and would allow for much larger read buffers than 64KB, which should > reduce the number of iops imposed by each read block operation. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532901#comment-13532901 ] Andy Isaacson commented on HDFS-4253: - bq. I don't see any reason why shuffle(a) could not be equal to shuffle(b), for two completely unrelated DatanodeIDs a and b. That's true, equality is possible. It's very unlikely given that we're choosing N items (where N is the replication count of a block, so nearly always 3, sometimes 10, possibly as absurdly high as 50) from the range of {{Random#nextInt}}, which is about 2**32. The algorithm does something reasonable in the case that the shuffle has a collision (it puts the items in some order, either stable or not, and either result is fine for the rest of the algorithm). It would be possible to remove the possibility of collisions, but I don't know how to do that quickly with minimal code. So the current implementation seemed to strike a nice balance between the desired behavior, efficient and easily understandable code, and low algorithmic complexity. bq. It also seems better to just use hashCode, rather than creating your own random set of random ints associated with objects. It's important that we get a different answer each time {{pseudoSortByDistance}} is invoked; that randomization is what spreads the read load out across the replicas. So using a stable value like hashCode would defeat that goal of this change. (Possibly it might be true that hashCode ordering would be different in different {{DFSClient}} instances on different nodes, but I see no guarantee of that, and even if it's true, depending on such a subtle implementation detail would be dangerous. And it still doesn't resolve the issue that a single DFSClient should pick different replicas from a given equivalence class of equally distant replicas, for various reads of a given block.) > block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance > - > > Key: HDFS-4253 > URL: https://issues.apache.org/jira/browse/HDFS-4253 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253.txt > > > When many nodes (10) read from the same block simultaneously, we get > asymmetric distribution of read load. This can result in slow block reads > when one replica is serving most of the readers and the other replicas are > idle. The busy DN bottlenecks on its network link. > This is especially visible with large block sizes and high replica counts (I > reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication > 5), but the same behavior happens on a small scale with normal-sized blocks > and replication=3. > The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which > explicitly does not try to spread traffic among replicas in a given rack -- > it only randomizes usage for off-rack replicas. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
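[Editor's note] For readers following the shuffle discussion, this hedged sketch (illustrative names, not the patch itself) shows the mechanism being described: each invocation assigns every replica a fresh random weight, so equally distant replicas come out in a different order on every call, and a rare nextInt() collision merely leaves two replicas in an arbitrary relative order, which is harmless for load spreading.
{code}
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

class ShuffleSketch {
  // Re-randomize on every sort so repeated reads of the same block
  // spread across the equally distant replicas.
  static void pseudoSortSketch(List<String> replicas) {
    Random rnd = new Random();
    Map<String, Integer> shuffle = new HashMap<>();
    for (String r : replicas) {
      shuffle.put(r, rnd.nextInt()); // ~2^32 values, so collisions are rare but possible
    }
    // A collision just yields an arbitrary relative order for the two
    // colliding replicas; either order is fine for the algorithm.
    replicas.sort(Comparator.comparingInt(shuffle::get));
  }
}
{code}
Using a stable value such as hashCode instead of the fresh weights would give the same order on every call, which is exactly what the comment above argues against.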
[jira] [Commented] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532896#comment-13532896 ] Hadoop QA commented on HDFS-4253: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12561062/hdfs4253-2.txt against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3667//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3667//console This message is automatically generated. > block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance > - > > Key: HDFS-4253 > URL: https://issues.apache.org/jira/browse/HDFS-4253 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253.txt > > > When many nodes (10) read from the same block simultaneously, we get > asymmetric distribution of read load. This can result in slow block reads > when one replica is serving most of the readers and the other replicas are > idle. The busy DN bottlenecks on its network link. > This is especially visible with large block sizes and high replica counts (I > reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication > 5), but the same behavior happens on a small scale with normal-sized blocks > and replication=3. > The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which > explicitly does not try to spread traffic among replicas in a given rack -- > it only randomizes usage for off-rack replicas. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3429) DataNode reads checksums even if client does not need them
[ https://issues.apache.org/jira/browse/HDFS-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532895#comment-13532895 ] liang xie commented on HDFS-3429: - and the HBase-specific issue is HBASE-5074, fixed in 0.94.0 > DataNode reads checksums even if client does not need them > -- > > Key: HDFS-3429 > URL: https://issues.apache.org/jira/browse/HDFS-3429 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.0.0-alpha >Reporter: Todd Lipcon >Assignee: Todd Lipcon > Attachments: hdfs-3429-0.20.2.patch, hdfs-3429-0.20.2.patch, > hdfs-3429.txt, hdfs-3429.txt, hdfs-3429.txt > > > Currently, even if the client does not want to verify checksums, the datanode > reads them anyway and sends them over the wire. This means that performance > improvements like HBase's application-level checksums don't have much benefit > when reading through the datanode, since the DN is still causing seeks into > the checksum file. > (Credit goes to Dhruba for discovering this - filing on his behalf) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3429) DataNode reads checksums even if client does not need them
[ https://issues.apache.org/jira/browse/HDFS-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532891#comment-13532891 ] liang xie commented on HDFS-3429: - Oh, [~tlipcon], you missed my words: "without patch". The strace showed the statistics without the patch. After applying the patch, I could not see so many meta files being opened. > DataNode reads checksums even if client does not need them > -- > > Key: HDFS-3429 > URL: https://issues.apache.org/jira/browse/HDFS-3429 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.0.0-alpha >Reporter: Todd Lipcon >Assignee: Todd Lipcon > Attachments: hdfs-3429-0.20.2.patch, hdfs-3429-0.20.2.patch, > hdfs-3429.txt, hdfs-3429.txt, hdfs-3429.txt > > > Currently, even if the client does not want to verify checksums, the datanode > reads them anyway and sends them over the wire. This means that performance > improvements like HBase's application-level checksums don't have much benefit > when reading through the datanode, since the DN is still causing seeks into > the checksum file. > (Credit goes to Dhruba for discovering this - filing on his behalf) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532864#comment-13532864 ] Colin Patrick McCabe commented on HDFS-4253: Thanks for clarifying that. I still think there's a problem, though-- I don't see any reason why shuffle(a) could not be equal to shuffle(b), for two completely unrelated DatanodeIDs a and b. This could be fixed by checking something that's supposed to be unique in the case where the two agree-- like the name field. It also seems better to just use {{hashCode}}, rather than creating your own random set of random ints associated with objects. > block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance > - > > Key: HDFS-4253 > URL: https://issues.apache.org/jira/browse/HDFS-4253 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253.txt > > > When many nodes (10) read from the same block simultaneously, we get > asymmetric distribution of read load. This can result in slow block reads > when one replica is serving most of the readers and the other replicas are > idle. The busy DN bottlenecks on its network link. > This is especially visible with large block sizes and high replica counts (I > reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication > 5), but the same behavior happens on a small scale with normal-sized blocks > and replication=3. > The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which > explicitly does not try to spread traffic among replicas in a given rack -- > it only randomizes usage for off-rack replicas. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4315) DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access
[ https://issues.apache.org/jira/browse/HDFS-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532860#comment-13532860 ] Eli Collins commented on HDFS-4315: --- Nice find! +1 pending jenkins > DNs with multiple BPs can have BPOfferServices fail to start due to > unsynchronized map access > - > > Key: HDFS-4315 > URL: https://issues.apache.org/jira/browse/HDFS-4315 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.0.2-alpha >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-4315.patch > > > In some nightly test runs we've seen pretty frequent failures of > TestWebHdfsWithMultipleNameNodes. I've traced the root cause to an > unsynchronized map access in the DataStorage class. > More details in the first comment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4315) DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access
[ https://issues.apache.org/jira/browse/HDFS-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-4315: - Status: Patch Available (was: Open) > DNs with multiple BPs can have BPOfferServices fail to start due to > unsynchronized map access > - > > Key: HDFS-4315 > URL: https://issues.apache.org/jira/browse/HDFS-4315 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.0.2-alpha >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-4315.patch > > > In some nightly test runs we've seen pretty frequent failures of > TestWebHdfsWithMultipleNameNodes. I've traced the root cause to an > unsynchronized map access in the DataStorage class. > More details in the first comment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4315) DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access
[ https://issues.apache.org/jira/browse/HDFS-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aaron T. Myers updated HDFS-4315: - Attachment: HDFS-4315.patch Here's a patch which addresses the issue. I've been looping the test for an hour now with no failures, whereas before it used to fail pretty reliably within 10 minutes. I'll keep it looping over the weekend and see how it goes. This patch also takes the liberty of re-enabling the DN log in TestWebHdfsWithMultipleNameNodes, so that we can better see the root cause of later failures. > DNs with multiple BPs can have BPOfferServices fail to start due to > unsynchronized map access > - > > Key: HDFS-4315 > URL: https://issues.apache.org/jira/browse/HDFS-4315 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.0.2-alpha >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > Attachments: HDFS-4315.patch > > > In some nightly test runs we've seen pretty frequent failures of > TestWebHdfsWithMultipleNameNodes. I've traced the root cause to an > unsynchronized map access in the DataStorage class. > More details in the first comment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4315) DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access
[ https://issues.apache.org/jira/browse/HDFS-4315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532857#comment-13532857 ] Aaron T. Myers commented on HDFS-4315: -- In all of the failing test runs that I saw, the client would end up failing with an error like the following:
{noformat}
2012-12-14 16:30:36,818 WARN hdfs.DFSClient (DFSOutputStream.java:run(562)) - DataStreamer Exception
java.io.IOException: Failed to add a datanode. User may turn off this feature by setting dfs.client.block.write.replace-datanode-on-failure.policy in configuration, where the current policy is DEFAULT. (Nodes: current=[127.0.0.1:52552, 127.0.0.1:43557], original=[127.0.0.1:43557, 127.0.0.1:52552])
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:792)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.addDatanode2ExistingPipeline(DFSOutputStream.java:852)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.setupPipelineForAppendOrRecovery(DFSOutputStream.java:958)
    at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:469)
{noformat}
This suggests that either an entire DN or one of the BPOfferServices of one of the DNs was not starting correctly, or had not started by the time the client was trying to access it. Unfortunately, TestWebHdfsWithMultipleNameNodes disables the DN logger, so it wasn't obvious what was causing that problem. Upon changing the test to not disable the logger and looping the test, I would occasionally see an error like the following:
{noformat}
java.lang.NullPointerException
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:850)
    at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:819)
    at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:308)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:218)
    at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:660)
    at java.lang.Thread.run(Thread.java:662)
{noformat}
This error would cause one of the BPOfferServices in one of the DNs to not come up. The reason for this is that concurrent, unsynchronized puts to the HashMap DataStorage#bpStorageMap result in undefined behavior, including previously-included entries no longer appearing to be in the map. > DNs with multiple BPs can have BPOfferServices fail to start due to > unsynchronized map access > - > > Key: HDFS-4315 > URL: https://issues.apache.org/jira/browse/HDFS-4315 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode >Affects Versions: 2.0.2-alpha >Reporter: Aaron T. Myers >Assignee: Aaron T. Myers > > In some nightly test runs we've seen pretty frequent failures of > TestWebHdfsWithMultipleNameNodes. I've traced the root cause to an > unsynchronized map access in the DataStorage class. > More details in the first comment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
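[Editor's note] A sketch of the class of fix implied by that analysis (the actual HDFS-4315 patch may differ in detail): make the per-block-pool storage map safe for concurrent access, for example with a ConcurrentHashMap or synchronized accessors, so an entry that one thread has added can never appear missing to another.
{code}
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class BpStorageMapSketch {
  // A plain HashMap under concurrent put/get has undefined behavior;
  // a concurrent map gives safe publication of each block pool's storage.
  private final Map<String, Object> bpStorageMap = new ConcurrentHashMap<>();

  void addBlockPoolStorage(String bpId, Object bpStorage) {
    bpStorageMap.putIfAbsent(bpId, bpStorage);
  }

  Object getBlockPoolStorage(String bpId) {
    return bpStorageMap.get(bpId);
  }
}
{code}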
[jira] [Created] (HDFS-4315) DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access
Aaron T. Myers created HDFS-4315: Summary: DNs with multiple BPs can have BPOfferServices fail to start due to unsynchronized map access Key: HDFS-4315 URL: https://issues.apache.org/jira/browse/HDFS-4315 Project: Hadoop HDFS Issue Type: Bug Components: datanode Affects Versions: 2.0.2-alpha Reporter: Aaron T. Myers Assignee: Aaron T. Myers In some nightly test runs we've seen pretty frequent failures of TestWebHdfsWithMultipleNameNodes. I've traced the root cause to an unsynchronized map access in the DataStorage class. More details in the first comment. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4314) failure to set sticky bit regression on branch-trunk-win
Chris Nauroth created HDFS-4314: --- Summary: failure to set sticky bit regression on branch-trunk-win Key: HDFS-4314 URL: https://issues.apache.org/jira/browse/HDFS-4314 Project: Hadoop HDFS Issue Type: Bug Components: hdfs-client Affects Versions: trunk-win Reporter: Chris Nauroth Assignee: Chris Nauroth Fix For: trunk-win The problem is visible by running {{TestDFSShell#testFilePermissions}}. The test fails on trying to set sticky bit. The problem is that branch-trunk-win accidentally merged in a branch-1 change in {{RawLocalFileSystem#setPermission}} to call {{FileUtil#setPermission}}, which sets permissions using Java {{File}} API. There is no way to set sticky bit through this API. We need to switch back to the trunk implementation of {{RawLocalFileSystem#setPermission}}, which uses either native code or a shell call to external chmod. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
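[Editor's note] For context on why the Java File API is a dead end here, a small illustrative sketch (not the Hadoop code): java.io.File exposes only read/write/execute toggles, so the sticky bit (the leading 1 in mode 1777) can only be set by shelling out to chmod or via native code, which is what the trunk implementation of RawLocalFileSystem#setPermission does.
{code}
import java.io.IOException;

class StickyBitSketch {
  // File#setReadable/setWritable/setExecutable cannot express a sticky bit,
  // so we invoke the external chmod, as a minimal stand-in for the
  // native-or-shell approach used on trunk.
  static void setStickyBit(String path) throws IOException, InterruptedException {
    Process p = new ProcessBuilder("chmod", "1777", path).inheritIO().start();
    if (p.waitFor() != 0) {
      throw new IOException("chmod failed for " + path);
    }
  }
}
{code}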
[jira] [Updated] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andy Isaacson updated HDFS-4253: Attachment: hdfs4253-2.txt Avoid extra a.equals(b) by checking {{aIsLocal && bIsLocal}} instead. On average for a given sort this will result in fewer calls to {{.equals()}}. > block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance > - > > Key: HDFS-4253 > URL: https://issues.apache.org/jira/browse/HDFS-4253 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs4253-1.txt, hdfs4253-2.txt, hdfs4253.txt > > > When many nodes (10) read from the same block simultaneously, we get > asymmetric distribution of read load. This can result in slow block reads > when one replica is serving most of the readers and the other replicas are > idle. The busy DN bottlenecks on its network link. > This is especially visible with large block sizes and high replica counts (I > reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication > 5), but the same behavior happens on a small scale with normal-sized blocks > and replication=3. > The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which > explicitly does not try to spread traffic among replicas in a given rack -- > it only randomizes usage for off-rack replicas. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4253) block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance
[ https://issues.apache.org/jira/browse/HDFS-4253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532779#comment-13532779 ] Andy Isaacson commented on HDFS-4253: - bq. The bug comes later, where you always return 1 if neither Node is on the local rack. This is wrong; it violates anticommutation (see link). But that's not what the code does. If neither Node is on the local rack, then {{aIsLocalRack == bIsLocalRack}} and we use the shuffle for a total ordering, right here: {code} 858 if (aIsLocalRack == bIsLocalRack) { 859 int ai = shuffle.get(a), bi = shuffle.get(b); 860 if (ai < bi) { 861 return -1; 862 } else if (ai > bi) { 863 return 1; 864 } else { 865 return 0; 866 } {code} The final {{else}} is only reached when {{bIsLocalRack && !aIsLocalRack}}. So I'm pretty sure this implementation does satisfy anticommutation. > block replica reads get hot-spots due to NetworkTopology#pseudoSortByDistance > - > > Key: HDFS-4253 > URL: https://issues.apache.org/jira/browse/HDFS-4253 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.2-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs4253-1.txt, hdfs4253.txt > > > When many nodes (10) read from the same block simultaneously, we get > asymmetric distribution of read load. This can result in slow block reads > when one replica is serving most of the readers and the other replicas are > idle. The busy DN bottlenecks on its network link. > This is especially visible with large block sizes and high replica counts (I > reproduced the problem with {{-Ddfs.block.size=4294967296}} and replication > 5), but the same behavior happens on a small scale with normal-sized blocks > and replication=3. > The root of the problem is in {{NetworkTopology#pseudoSortByDistance}} which > explicitly does not try to spread traffic among replicas in a given rack -- > it only randomizes usage for off-rack replicas. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3429) DataNode reads checksums even if client does not need them
[ https://issues.apache.org/jira/browse/HDFS-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532724#comment-13532724 ] Todd Lipcon commented on HDFS-3429: --- Hi Liang. I'm not sure if 0.94.2 has the code right to take advantage of this new feature quite yet -- given you see a bunch of the .meta files being read, it seems like it doesn't. So, that would explain why you don't see a performance difference. > DataNode reads checksums even if client does not need them > -- > > Key: HDFS-3429 > URL: https://issues.apache.org/jira/browse/HDFS-3429 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.0.0-alpha >Reporter: Todd Lipcon >Assignee: Todd Lipcon > Attachments: hdfs-3429-0.20.2.patch, hdfs-3429-0.20.2.patch, > hdfs-3429.txt, hdfs-3429.txt, hdfs-3429.txt > > > Currently, even if the client does not want to verify checksums, the datanode > reads them anyway and sends them over the wire. This means that performance > improvements like HBase's application-level checksums don't have much benefit > when reading through the datanode, since the DN is still causing seeks into > the checksum file. > (Credit goes to Dhruba for discovering this - filing on his behalf) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3465) 2NN doesn't start with fs.defaultFS set to a viewfs URI unless service RPC address is also set
[ https://issues.apache.org/jira/browse/HDFS-3465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532686#comment-13532686 ] Joseph Kniest commented on HDFS-3465: - Hi, I am new to HDFS dev and I would like to take this issue as my first. It may take a while because it's my first issue and because of my schedule but I will do my best to be as prompt as possible. Thanks! > 2NN doesn't start with fs.defaultFS set to a viewfs URI unless service RPC > address is also set > -- > > Key: HDFS-3465 > URL: https://issues.apache.org/jira/browse/HDFS-3465 > Project: Hadoop HDFS > Issue Type: Bug > Components: federation, namenode >Affects Versions: 2.0.0-alpha >Reporter: Eli Collins > Labels: newbie > > Looks like the 2NN first tries servicerpc-address then falls back on > fs.defaultFS, which won't work in the case of federation since fs.defaultFS > doesn't refer to an RPC address. Instead, the 2NN should first check > servicerpc-address, then rpc-address, then fall back on fs.defaultFS.
{noformat}
Exception in thread "main" java.lang.IllegalArgumentException: Invalid URI for NameNode address (check fs.defaultFS): viewfs:/// has no authority.
    at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:315)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.getAddress(NameNode.java:303)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.getServiceAddress(NameNode.java:296)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:214)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:178)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:582)
{noformat}
-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
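[Editor's note] A hedged sketch of the fallback order the description asks for. The key names are the standard HDFS configuration keys; the class and method are illustrative, not the actual SecondaryNameNode code:
{code}
import org.apache.hadoop.conf.Configuration;

class NnAddressSketch {
  // Proposed order: service RPC address first, then the client RPC address,
  // and only then fs.defaultFS, so a viewfs:// default filesystem no longer
  // breaks 2NN startup under federation.
  static String resolveNnAddress(Configuration conf) {
    String addr = conf.get("dfs.namenode.servicerpc-address");
    if (addr == null) {
      addr = conf.get("dfs.namenode.rpc-address");
    }
    if (addr == null) {
      addr = conf.get("fs.defaultFS"); // last resort; may be a viewfs URI
    }
    return addr;
  }
}
{code}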
[jira] [Created] (HDFS-4313) MiniDFSCluster throws NPE if umask is more permissive than 022
Luke Lu created HDFS-4313: - Summary: MiniDFSCluster throws NPE if umask is more permissive than 022 Key: HDFS-4313 URL: https://issues.apache.org/jira/browse/HDFS-4313 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 1.1.1 Reporter: Luke Lu Priority: Minor MiniDFSCluster startup throws an NPE if the umask is more permissive than 022, e.g. 002. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3912) Detecting and avoiding stale datanodes for writing
[ https://issues.apache.org/jira/browse/HDFS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532495#comment-13532495 ] Harsh J commented on HDFS-3912: --- bq. Are you sure? It's committed in branch-1? Yes, branch-1 has this as a backport commit; its separate branch-1 patch is attached as well. > Detecting and avoiding stale datanodes for writing > -- > > Key: HDFS-3912 > URL: https://issues.apache.org/jira/browse/HDFS-3912 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Jing Zhao >Assignee: Jing Zhao > Fix For: 1.2.0, 2.0.3-alpha > > Attachments: HDFS-3912.001.patch, HDFS-3912.002.patch, > HDFS-3912.003.patch, HDFS-3912.004.patch, HDFS-3912.005.patch, > HDFS-3912.006.patch, HDFS-3912.007.patch, HDFS-3912.008.patch, > HDFS-3912.009.patch, HDFS-3912-010.patch, HDFS-3912-branch-1.1-001.patch, > HDFS-3912-branch-1.patch, HDFS-3912.branch-1.patch > > > 1. Make stale timeout adaptive to the number of nodes marked stale in the > cluster. > 2. Consider having a separate configuration for write skipping the stale > nodes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3912) Detecting and avoiding stale datanodes for writing
[ https://issues.apache.org/jira/browse/HDFS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532493#comment-13532493 ] Harsh J commented on HDFS-3912: --- bq. FYI: This patch is missing the branch-2 patch. After applying HDFS-3703 for branch-2, it's missing the DFS_NAMENODE_CHECK_STALE_DATANODE_DEFAULT settings, etc.. The diff may be dependent on the JIRA you mention, but perhaps not the patch itself. We merged the trunk commit directly into branch-2, as viewable/downloadable here: view at http://svn.apache.org/viewvc?view=revision&revision=1397219 and download at http://svn.apache.org/viewvc/hadoop/common/branches/branch-2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/DFSConfigKeys.java?revision=1397219&view=co If you use git locally, you can also add a remote and cherry-pick it out I guess. > Detecting and avoiding stale datanodes for writing > -- > > Key: HDFS-3912 > URL: https://issues.apache.org/jira/browse/HDFS-3912 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Jing Zhao >Assignee: Jing Zhao > Fix For: 1.2.0, 2.0.3-alpha > > Attachments: HDFS-3912.001.patch, HDFS-3912.002.patch, > HDFS-3912.003.patch, HDFS-3912.004.patch, HDFS-3912.005.patch, > HDFS-3912.006.patch, HDFS-3912.007.patch, HDFS-3912.008.patch, > HDFS-3912.009.patch, HDFS-3912-010.patch, HDFS-3912-branch-1.1-001.patch, > HDFS-3912-branch-1.patch, HDFS-3912.branch-1.patch > > > 1. Make stale timeout adaptive to the number of nodes marked stale in the > cluster. > 2. Consider having a separate configuration for write skipping the stale > nodes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
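[Editor's note] For anyone following the cherry-pick suggestion, the commands would look roughly like the following. The mirror URL and the commit hash are assumptions to be looked up, not values given in this thread; only svn revision 1397219 is stated above.
{noformat}
# add the Apache git mirror as a remote and fetch it
git remote add apache git://git.apache.org/hadoop-common.git
git fetch apache
# find the branch-2 commit corresponding to svn revision 1397219
git log apache/branch-2 --grep=HDFS-3912
# cherry-pick it into your local branch (hash is hypothetical)
git cherry-pick <commit-hash>
{noformat}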
[jira] [Commented] (HDFS-3912) Detecting and avoiding stale datanodes for writing
[ https://issues.apache.org/jira/browse/HDFS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532488#comment-13532488 ] nkeywal commented on HDFS-3912: --- Are you sure? It's committed in branch-1? > Detecting and avoiding stale datanodes for writing > -- > > Key: HDFS-3912 > URL: https://issues.apache.org/jira/browse/HDFS-3912 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Jing Zhao >Assignee: Jing Zhao > Fix For: 1.2.0, 2.0.3-alpha > > Attachments: HDFS-3912.001.patch, HDFS-3912.002.patch, > HDFS-3912.003.patch, HDFS-3912.004.patch, HDFS-3912.005.patch, > HDFS-3912.006.patch, HDFS-3912.007.patch, HDFS-3912.008.patch, > HDFS-3912.009.patch, HDFS-3912-010.patch, HDFS-3912-branch-1.1-001.patch, > HDFS-3912-branch-1.patch, HDFS-3912.branch-1.patch > > > 1. Make stale timeout adaptive to the number of nodes marked stale in the > cluster. > 2. Consider having a separate configuration for write skipping the stale > nodes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3912) Detecting and avoiding stale datanodes for writing
[ https://issues.apache.org/jira/browse/HDFS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532482#comment-13532482 ] Jeremy Carroll commented on HDFS-3912: -- Basically this patch requires HDFS-3601 (Version 3.0). So there is no Branch 2.0 patch on the ticket. > Detecting and avoiding stale datanodes for writing > -- > > Key: HDFS-3912 > URL: https://issues.apache.org/jira/browse/HDFS-3912 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Jing Zhao >Assignee: Jing Zhao > Fix For: 1.2.0, 2.0.3-alpha > > Attachments: HDFS-3912.001.patch, HDFS-3912.002.patch, > HDFS-3912.003.patch, HDFS-3912.004.patch, HDFS-3912.005.patch, > HDFS-3912.006.patch, HDFS-3912.007.patch, HDFS-3912.008.patch, > HDFS-3912.009.patch, HDFS-3912-010.patch, HDFS-3912-branch-1.1-001.patch, > HDFS-3912-branch-1.patch, HDFS-3912.branch-1.patch > > > 1. Make stale timeout adaptive to the number of nodes marked stale in the > cluster. > 2. Consider having a separate configuration for write skipping the stale > nodes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3912) Detecting and avoiding stale datanodes for writing
[ https://issues.apache.org/jira/browse/HDFS-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532457#comment-13532457 ] Jeremy Carroll commented on HDFS-3912: -- FYI: This patch is missing the branch-2 patch. After applying HDFS-3703 for branch-2, it's missing the DFS_NAMENODE_CHECK_STALE_DATANODE_DEFAULT settings, etc.. > Detecting and avoiding stale datanodes for writing > -- > > Key: HDFS-3912 > URL: https://issues.apache.org/jira/browse/HDFS-3912 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: 3.0.0 >Reporter: Jing Zhao >Assignee: Jing Zhao > Fix For: 1.2.0, 2.0.3-alpha > > Attachments: HDFS-3912.001.patch, HDFS-3912.002.patch, > HDFS-3912.003.patch, HDFS-3912.004.patch, HDFS-3912.005.patch, > HDFS-3912.006.patch, HDFS-3912.007.patch, HDFS-3912.008.patch, > HDFS-3912.009.patch, HDFS-3912-010.patch, HDFS-3912-branch-1.1-001.patch, > HDFS-3912-branch-1.patch, HDFS-3912.branch-1.patch > > > 1. Make stale timeout adaptive to the number of nodes marked stale in the > cluster. > 2. Consider having a separate configuration for write skipping the stale > nodes. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4312) fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc
[ https://issues.apache.org/jira/browse/HDFS-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532388#comment-13532388 ] Hadoop QA commented on HDFS-4312: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12560969/HDFS-4312.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3666//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3666//console This message is automatically generated. > fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc > -- > > Key: HDFS-4312 > URL: https://issues.apache.org/jira/browse/HDFS-4312 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Ivan A. Veselovsky >Assignee: Ivan A. Veselovsky > Attachments: HDFS-4312.patch > > > TestSecureNameNode does not work on Java6 without > "dfs.web.authentication.kerberos.principal" config property set. > Also the following improved: > 1) keytab files are checked for existence and readability to provide > fast-fail on config error. > 2) added comment to TestSecureNameNode describing the required sys props. > 3) string literals replaced with config constants. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4310) fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode
[ https://issues.apache.org/jira/browse/HDFS-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532316#comment-13532316 ] Hudson commented on HDFS-4310: -- Integrated in Hadoop-Mapreduce-trunk #1285 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1285/]) HDFS-4310. fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode Contributed by Ivan A. Veselovsky. (Revision 1421560) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1421560 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode.java > fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode > --- > > Key: HDFS-4310 > URL: https://issues.apache.org/jira/browse/HDFS-4310 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Ivan A. Veselovsky >Assignee: Ivan A. Veselovsky > Fix For: 3.0.0 > > Attachments: HDFS-4310.patch > > > the test org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode > catches exceptions and does not re-throw them. Due to that it passes even if > it actually failed. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4307) SocketCache should use monotonic time
[ https://issues.apache.org/jira/browse/HDFS-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532313#comment-13532313 ] Hudson commented on HDFS-4307: -- Integrated in Hadoop-Mapreduce-trunk #1285 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1285/]) HDFS-4307. SocketCache should use monotonic time. Contributed by Colin Patrick McCabe. (Revision 1421572) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1421572 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/SocketCache.java > SocketCache should use monotonic time > - > > Key: HDFS-4307 > URL: https://issues.apache.org/jira/browse/HDFS-4307 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe >Priority: Minor > Fix For: 2.0.3-alpha > > Attachments: HDFS-4307.001.patch, HDFS-4307.002.patch > > > {{SocketCache}} should use monotonic time, not wall-clock time. Otherwise, > if the time is adjusted by ntpd or a system administrator, sockets could be > either abrupbtly expired, or left in the cache indefinitely. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4312) fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc
[ https://issues.apache.org/jira/browse/HDFS-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan A. Veselovsky updated HDFS-4312: - Affects Version/s: 3.0.0 Status: Patch Available (was: Open) > fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc > -- > > Key: HDFS-4312 > URL: https://issues.apache.org/jira/browse/HDFS-4312 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Ivan A. Veselovsky >Assignee: Ivan A. Veselovsky > Attachments: HDFS-4312.patch > > > TestSecureNameNode does not work on Java6 without > "dfs.web.authentication.kerberos.principal" config property set. > Also the following improved: > 1) keytab files are checked for existence and readability to provide > fast-fail on config error. > 2) added comment to TestSecureNameNode describing the required sys props. > 3) string literals replaced with config constants. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4312) fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc
[ https://issues.apache.org/jira/browse/HDFS-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan A. Veselovsky updated HDFS-4312: - Attachment: HDFS-4312.patch > fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc > -- > > Key: HDFS-4312 > URL: https://issues.apache.org/jira/browse/HDFS-4312 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ivan A. Veselovsky >Assignee: Ivan A. Veselovsky > Attachments: HDFS-4312.patch > > > TestSecureNameNode does not work on Java6 without the > "dfs.web.authentication.kerberos.principal" config property set. > The following improvements were also made: > 1) keytab files are checked for existence and readability to provide > fast-fail on config errors; > 2) a comment was added to TestSecureNameNode describing the required system properties; > 3) string literals were replaced with config constants. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4312) fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc
[ https://issues.apache.org/jira/browse/HDFS-4312?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan A. Veselovsky updated HDFS-4312: - Summary: fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc (was: fix test TestSecureNameNode and improve test TestSecureNameNode) > fix test TestSecureNameNode and improve test TestSecureNameNodeWithExternalKdc > -- > > Key: HDFS-4312 > URL: https://issues.apache.org/jira/browse/HDFS-4312 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ivan A. Veselovsky >Assignee: Ivan A. Veselovsky > > TestSecureNameNode does not work on Java6 without the > "dfs.web.authentication.kerberos.principal" config property set. > The following improvements were also made: > 1) keytab files are checked for existence and readability to provide > fast-fail on config errors; > 2) a comment was added to TestSecureNameNode describing the required system properties; > 3) string literals were replaced with config constants. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HDFS-4312) fix test TestSecureNameNode and improve test TestSecureNameNode
Ivan A. Veselovsky created HDFS-4312: Summary: fix test TestSecureNameNode and improve test TestSecureNameNode Key: HDFS-4312 URL: https://issues.apache.org/jira/browse/HDFS-4312 Project: Hadoop HDFS Issue Type: Bug Reporter: Ivan A. Veselovsky Assignee: Ivan A. Veselovsky TestSecureNameNode does not work on Java6 without the "dfs.web.authentication.kerberos.principal" config property set. The following improvements were also made: 1) keytab files are checked for existence and readability to provide fast-fail on config errors; 2) a comment was added to TestSecureNameNode describing the required system properties; 3) string literals were replaced with config constants. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
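A minimal sketch of improvement 1) above, the fast-fail keytab check; the class, method name, and message are illustrative, not the actual patch code.
{code}
import java.io.File;

// Hypothetical sketch: fail immediately with a clear message instead of
// letting the Kerberos login fail later with a more obscure error.
final class KeytabCheck {
  static void checkKeytab(String keytabPath) {
    File keytab = new File(keytabPath);
    if (!keytab.exists() || !keytab.canRead()) {
      throw new IllegalArgumentException(
          "Keytab file " + keytabPath + " is missing or unreadable; "
          + "check the test's Kerberos configuration.");
    }
  }
}
{code}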
[jira] [Commented] (HDFS-4309) Multithreaded get through the Cache FileSystem Object to lead LeaseChecker memory leak
[ https://issues.apache.org/jira/browse/HDFS-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532291#comment-13532291 ] ChenFolin commented on HDFS-4309: - Hi Aaron T. Myers, when I execute "dev-support/test-patch.sh patch", it reports many errors, such as: "org.apache.hadoop.record.RecordComparator is deprecated." The code in question is: {code} @Deprecated @InterfaceAudience.Public @InterfaceStability.Stable public abstract class RecordComparator extends WritableComparator { {code} So "dev-support/test-patch.sh patch" fails. How should I proceed? == == Determining number of patched javac warnings. == == mvn clean test -DskipTests -DHadoopPatchProcess -Pnative -Ptest-patch > /tmp/patchJavacWarnings.txt 2>&1 {color:red}-1 overall{color}. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The patch appears to cause the build to fail. == == Finished build. == == > Multithreaded get through the Cache FileSystem Object to lead LeaseChecker > memory leak > -- > > Key: HDFS-4309 > URL: https://issues.apache.org/jira/browse/HDFS-4309 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs-client >Affects Versions: 0.20.205.0, 0.23.1, 0.23.4, 2.0.1-alpha, 2.0.2-alpha >Reporter: MaWenJin > Labels: patch > Attachments: jmap2.log > > Original Estimate: 204h > Remaining Estimate: 204h > > If multiple threads concurrently execute the following method, fs = createFileSystem(uri, conf) can be called by more than one thread, creating multiple DFSClient instances. Each one starts a LeaseChecker daemon thread, and the shutdown hook may not be able to close all of them before the process exits, resulting in a memory leak.
> {code}
> private FileSystem getInternal(URI uri, Configuration conf, Key key) throws IOException {
>   FileSystem fs = null;
>   synchronized (this) {
>     fs = map.get(key);
>   }
>   if (fs != null) {
>     return fs;
>   }
>   // this is the unsynchronized creation discussed above
>   fs = createFileSystem(uri, conf);
>   synchronized (this) { // re-acquire the lock
>     FileSystem oldfs = map.get(key);
>     if (oldfs != null) { // a file system was created while the lock was released
>       fs.close();    // close the new file system
>       return oldfs;  // return the old file system
>     }
>     // now insert the new file system into the map
>     if (map.isEmpty() && !clientFinalizer.isAlive()) {
>       Runtime.getRuntime().addShutdownHook(clientFinalizer);
>     }
>     fs.key = key;
>     map.put(key, fs);
>     if (conf.getBoolean("fs.automatic.close", true)) {
>       toAutoClose.add(key);
>     }
>     return fs;
>   }
> }
> {code} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
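The race the reporter describes is a classic check-then-act window: both threads can observe a cache miss before either inserts, so two clients get constructed, each of which would start its own daemon thread. A self-contained sketch of that window, with plain Objects standing in for FileSystem instances and a sleep to widen the race for demonstration:
{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class CacheRaceDemo {
  static final ConcurrentHashMap<String, Object> CACHE = new ConcurrentHashMap<>();
  static final AtomicInteger CREATED = new AtomicInteger();

  static Object get(String key) throws InterruptedException {
    Object fs = CACHE.get(key);    // unsynchronized read: both threads can miss
    if (fs != null) {
      return fs;
    }
    Thread.sleep(10);              // widen the race window for the demo
    Object created = new Object(); // stands in for createFileSystem(uri, conf),
    CREATED.incrementAndGet();     // which would start a LeaseChecker daemon thread
    Object prev = CACHE.putIfAbsent(key, created);
    return prev != null ? prev : created;
  }

  public static void main(String[] args) throws Exception {
    Runnable r = () -> {
      try {
        get("hdfs://nn");
      } catch (InterruptedException ignored) {
      }
    };
    Thread t1 = new Thread(r);
    Thread t2 = new Thread(r);
    t1.start(); t2.start();
    t1.join(); t2.join();
    // Typically prints 2: two clients were constructed but only one is cached.
    System.out.println("clients created: " + CREATED.get());
  }
}
{code}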
[jira] [Commented] (HDFS-4310) fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode
[ https://issues.apache.org/jira/browse/HDFS-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532289#comment-13532289 ] Hudson commented on HDFS-4310: -- Integrated in Hadoop-Hdfs-trunk #1254 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1254/]) HDFS-4310. fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode Contributed by Ivan A. Veselovsky. (Revision 1421560) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1421560 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode.java > fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode > --- > > Key: HDFS-4310 > URL: https://issues.apache.org/jira/browse/HDFS-4310 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Ivan A. Veselovsky >Assignee: Ivan A. Veselovsky > Fix For: 3.0.0 > > Attachments: HDFS-4310.patch > > > The test org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode > catches exceptions and does not re-throw them. As a result, it passes even if > it actually fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4307) SocketCache should use monotonic time
[ https://issues.apache.org/jira/browse/HDFS-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532286#comment-13532286 ] Hudson commented on HDFS-4307: -- Integrated in Hadoop-Hdfs-trunk #1254 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1254/]) HDFS-4307. SocketCache should use monotonic time. Contributed by Colin Patrick McCabe. (Revision 1421572) Result = FAILURE atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1421572 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/SocketCache.java > SocketCache should use monotonic time > - > > Key: HDFS-4307 > URL: https://issues.apache.org/jira/browse/HDFS-4307 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe >Priority: Minor > Fix For: 2.0.3-alpha > > Attachments: HDFS-4307.001.patch, HDFS-4307.002.patch > > > {{SocketCache}} should use monotonic time, not wall-clock time. Otherwise, > if the time is adjusted by ntpd or a system administrator, sockets could be > either abruptly expired, or left in the cache indefinitely. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4307) SocketCache should use monotonic time
[ https://issues.apache.org/jira/browse/HDFS-4307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532229#comment-13532229 ] Hudson commented on HDFS-4307: -- Integrated in Hadoop-Yarn-trunk #65 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/65/]) HDFS-4307. SocketCache should use monotonic time. Contributed by Colin Patrick McCabe. (Revision 1421572) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1421572 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/SocketCache.java > SocketCache should use monotonic time > - > > Key: HDFS-4307 > URL: https://issues.apache.org/jira/browse/HDFS-4307 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.0.3-alpha >Reporter: Colin Patrick McCabe >Assignee: Colin Patrick McCabe >Priority: Minor > Fix For: 2.0.3-alpha > > Attachments: HDFS-4307.001.patch, HDFS-4307.002.patch > > > {{SocketCache}} should use monotonic time, not wall-clock time. Otherwise, > if the time is adjusted by ntpd or a system administrator, sockets could be > either abruptly expired, or left in the cache indefinitely. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4310) fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode
[ https://issues.apache.org/jira/browse/HDFS-4310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532232#comment-13532232 ] Hudson commented on HDFS-4310: -- Integrated in Hadoop-Yarn-trunk #65 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/65/]) HDFS-4310. fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode Contributed by Ivan A. Veselovsky. (Revision 1421560) Result = SUCCESS atm : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1421560 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode.java > fix test org.apache.hadoop.hdfs.server.datanode.TestStartSecureDataNode > --- > > Key: HDFS-4310 > URL: https://issues.apache.org/jira/browse/HDFS-4310 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0 >Reporter: Ivan A. Veselovsky >Assignee: Ivan A. Veselovsky > Fix For: 3.0.0 > > Attachments: HDFS-4310.patch > > > The test org/apache/hadoop/hdfs/server/datanode/TestStartSecureDataNode > catches exceptions and does not re-throw them. As a result, it passes even if > it actually fails. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-4311) repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos
[ https://issues.apache.org/jira/browse/HDFS-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532195#comment-13532195 ] Hadoop QA commented on HDFS-4311: - {color:green}+1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12560935/HDFS-4311.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 2 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs-httpfs. {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3665//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3665//console This message is automatically generated. > repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos > --- > > Key: HDFS-4311 > URL: https://issues.apache.org/jira/browse/HDFS-4311 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.3-alpha >Reporter: Ivan A. Veselovsky >Assignee: Ivan A. Veselovsky > Attachments: HDFS-4311.patch > > > Some of the test cases in this test class are failing because they are > affected by static state changed by previous test cases, namely the static > field org.apache.hadoop.security.UserGroupInformation.loginUser. > The suggested patch solves this problem. > In addition, the following improvements are made: > 1) the user principal and keytab values are parametrized via system properties; > 2) shutdown of the Jetty server and the minicluster between test cases is > added to make the test methods independent of each other. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
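A hedged sketch of improvement 2) above: tearing the servers down after each test case so no state leaks into the next one. This is not the actual patch; the field names are illustrative, MiniDFSCluster.shutdown() is the standard way to stop the minicluster, and the era's embedded Jetty server (org.mortbay.jetty.Server) is assumed to be stopped with stop().
{code}
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.junit.After;
import org.mortbay.jetty.Server;

public class TestIsolationSketch {
  private MiniDFSCluster cluster;  // assumed to be started in each test case
  private Server jetty;            // assumed embedded HttpFS Jetty server

  @After
  public void cleanUp() throws Exception {
    if (jetty != null) { jetty.stop(); jetty = null; }
    if (cluster != null) { cluster.shutdown(); cluster = null; }
  }
}
{code}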
[jira] [Updated] (HDFS-4311) repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos
[ https://issues.apache.org/jira/browse/HDFS-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan A. Veselovsky updated HDFS-4311: - Affects Version/s: 2.0.3-alpha 3.0.0 Status: Patch Available (was: Open) > repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos > --- > > Key: HDFS-4311 > URL: https://issues.apache.org/jira/browse/HDFS-4311 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 3.0.0, 2.0.3-alpha >Reporter: Ivan A. Veselovsky >Assignee: Ivan A. Veselovsky > Attachments: HDFS-4311.patch > > > Some of the test cases in this test class are failing because they are > affected by static state changed by previous test cases, namely the static > field org.apache.hadoop.security.UserGroupInformation.loginUser. > The suggested patch solves this problem. > In addition, the following improvements are made: > 1) the user principal and keytab values are parametrized via system properties; > 2) shutdown of the Jetty server and the minicluster between test cases is > added to make the test methods independent of each other. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-4311) repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos
[ https://issues.apache.org/jira/browse/HDFS-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan A. Veselovsky updated HDFS-4311: - Attachment: HDFS-4311.patch > repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos > --- > > Key: HDFS-4311 > URL: https://issues.apache.org/jira/browse/HDFS-4311 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ivan A. Veselovsky >Assignee: Ivan A. Veselovsky > Attachments: HDFS-4311.patch > > > Some of the test cases in this test class are failing because they are > affected by static state changed by previous test cases, namely the static > field org.apache.hadoop.security.UserGroupInformation.loginUser. > The suggested patch solves this problem. > In addition, the following improvements are made: > 1) the user principal and keytab values are parametrized via system properties; > 2) shutdown of the Jetty server and the minicluster between test cases is > added to make the test methods independent of each other. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Moved] (HDFS-4311) repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos
[ https://issues.apache.org/jira/browse/HDFS-4311?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ivan A. Veselovsky moved HADOOP-9143 to HDFS-4311: -- Key: HDFS-4311 (was: HADOOP-9143) Project: Hadoop HDFS (was: Hadoop Common) > repair test org.apache.hadoop.fs.http.server.TestHttpFSWithKerberos > --- > > Key: HDFS-4311 > URL: https://issues.apache.org/jira/browse/HDFS-4311 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Ivan A. Veselovsky >Assignee: Ivan A. Veselovsky > > Some of the test cases in this test class are failing because they are > affected by static state changed by previous test cases, namely the static > field org.apache.hadoop.security.UserGroupInformation.loginUser. > The suggested patch solves this problem. > In addition, the following improvements are made: > 1) the user principal and keytab values are parametrized via system properties; > 2) shutdown of the Jetty server and the minicluster between test cases is > added to make the test methods independent of each other. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3429) DataNode reads checksums even if client does not need them
[ https://issues.apache.org/jira/browse/HDFS-3429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532161#comment-13532161 ] liang xie commented on HDFS-3429: - Still no obvious difference was found in another 100%-read scenario that was not IO-bound. I ran "strace -p -f -tt -T -e trace=file -o bbb" during a run of several minutes (without the patch), then:
{noformat}
grep "current/finalized" bbb|wc -l   16905
grep meta bbb|wc -l   9858
grep meta bbb|grep open|wc -l   3286
grep meta bbb|grep stat|wc -l   6572
grep meta bbb|grep "\".*\"" -o|sort -n |uniq -c|wc -l   303
{noformat}
Most of those meta files are several hundred kilobytes in size; furthermore, our OS has a default read_ahead_kb of 128, so it seems plausible that the benefit would not be obvious here. Any idea, [~tlipcon]? I am still +1 on this patch, since it reduces some unnecessary IO and system calls. > DataNode reads checksums even if client does not need them > -- > > Key: HDFS-3429 > URL: https://issues.apache.org/jira/browse/HDFS-3429 > Project: Hadoop HDFS > Issue Type: Bug > Components: datanode, performance >Affects Versions: 2.0.0-alpha >Reporter: Todd Lipcon >Assignee: Todd Lipcon > Attachments: hdfs-3429-0.20.2.patch, hdfs-3429-0.20.2.patch, > hdfs-3429.txt, hdfs-3429.txt, hdfs-3429.txt > > > Currently, even if the client does not want to verify checksums, the datanode > reads them anyway and sends them over the wire. This means that performance > improvements like HBase's application-level checksums don't have much benefit > when reading through the datanode, since the DN is still causing seeks into > the checksum file. > (Credit goes to Dhruba for discovering this - filing on his behalf) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
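For context, a heavily hedged sketch of the shape of the optimization (not the actual BlockSender change): when the client declines checksum verification, the DataNode can avoid opening the block's .meta file at all, which is exactly the class of open()/stat() calls counted in the strace output above. All names here are hypothetical.
{code}
import java.io.DataInputStream;
import java.io.FileInputStream;
import java.io.IOException;

// Illustrative only; names and structure are hypothetical, not Hadoop's API.
final class ChecksumStreamHelper {
  static DataInputStream openChecksumStreamIfNeeded(
      String metaPath, boolean clientWantsChecksums) throws IOException {
    if (!clientWantsChecksums) {
      return null;  // no open(), no stat(), no read-ahead on the .meta file
    }
    return new DataInputStream(new FileInputStream(metaPath));
  }
}
{code}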
[jira] [Commented] (HDFS-4140) fuse-dfs handles open(O_TRUNC) poorly
[ https://issues.apache.org/jira/browse/HDFS-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13532152#comment-13532152 ] Hadoop QA commented on HDFS-4140: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12560910/HDFS-4140.008.patch against trunk revision . {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestPersistBlocks org.apache.hadoop.hdfs.server.blockmanagement.TestRBWBlockInvalidation {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3664//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3664//console This message is automatically generated. > fuse-dfs handles open(O_TRUNC) poorly > - > > Key: HDFS-4140 > URL: https://issues.apache.org/jira/browse/HDFS-4140 > Project: Hadoop HDFS > Issue Type: Bug > Components: fuse-dfs >Affects Versions: 2.0.2-alpha >Reporter: Andy Isaacson >Assignee: Colin Patrick McCabe > Attachments: HDFS-4140.003.patch, HDFS-4140.004.patch, > HDFS-4140.005.patch, HDFS-4140.006.patch, HDFS-4140.007.patch, > HDFS-4140.008.patch > > > fuse-dfs handles open(O_TRUNC) poorly. > It is converted to multiple fuse operations. Those multiple fuse operations > often fail (for example, calling fuse_truncate_impl() while a file is also > open for write results in a "multiple writers!" exception.) > One easy way to see the problem is to run the following sequence of shell > commands: > {noformat} > ubuntu@ubu-cdh-0:~$ echo foo > /export/hdfs/tmp/a/t1.txt > ubuntu@ubu-cdh-0:~$ ls -l /export/hdfs/tmp/a > total 0 > -rw-r--r-- 1 ubuntu hadoop 4 Nov 1 15:21 t1.txt > ubuntu@ubu-cdh-0:~$ hdfs dfs -ls /tmp/a > Found 1 items > -rw-r--r-- 3 ubuntu hadoop 4 2012-11-01 15:21 /tmp/a/t1.txt > ubuntu@ubu-cdh-0:~$ echo bar > /export/hdfs/tmp/a/t1.txt > ubuntu@ubu-cdh-0:~$ ls -l /export/hdfs/tmp/a > total 0 > -rw-r--r-- 1 ubuntu hadoop 0 Nov 1 15:22 t1.txt > ubuntu@ubu-cdh-0:~$ hdfs dfs -ls /tmp/a > Found 1 items > -rw-r--r-- 3 ubuntu hadoop 0 2012-11-01 15:22 /tmp/a/t1.txt > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
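For contrast with the shell transcript above: through the Java API, a truncating open is a single operation. FileSystem.create(path, true) overwrites (and thus truncates) an existing file in one call, rather than the open-plus-truncate sequence that fuse-dfs must synthesize from O_TRUNC. A minimal sketch, assuming a default-configured client and the same /tmp/a/t1.txt path used in the transcript:
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class TruncatingWrite {
  public static void main(String[] args) throws Exception {
    FileSystem fs = FileSystem.get(new Configuration());
    // overwrite=true replaces any existing /tmp/a/t1.txt in a single create,
    // with no separate truncate step.
    try (FSDataOutputStream out = fs.create(new Path("/tmp/a/t1.txt"), true)) {
      out.writeBytes("bar\n");
    }
  }
}
{code}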