[jira] Commented: (HDFS-970) FSImage writing should always fsync before close
[ https://issues.apache.org/jira/browse/HDFS-970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867423#action_12867423 ] Todd Lipcon commented on HDFS-970: -- To prove that this is indeed absolutely necessary, I performed the following test on my desktop (2.6.31):
1) Create a loop device with 1G storage
{code}
# dd if=/dev/zero of=myloop bs=1M count=1000
# losetup -f myloop
{code}
2) Make a faulty-type md array:
{code}
# mdadm --create /dev/md0 --level=faulty --raid-devices=1 /dev/loop1
{code}
3) format it as ext4
{code}
# mkfs.ext4 /dev/md0
{code}
4) mount it in /mnt
{code}
# mount -t ext4 /dev/md0 /mnt
{code}
5) run the following python script:
{code}
#!/usr/bin/env python
import os
for idx in xrange(1, 10):
    f = file("file_%d_ckpt" % idx, "w")
    for line in xrange(0, 100):
        print >>f, "hello world! this is line %d" % line
    f.close()
    os.rename("file_%d_ckpt" % idx, "file_%d" % idx)
    print "Saved file %d" % idx
{code}
6) While running, block all writes to the disk (this essentially freezes the disk as if a power outage occurred):
{code}
# mdadm --grow /dev/md0 -l faulty -p write-all
{code}
Script output:
{code}
Saved file 1
Saved file 2
Saved file 3
Saved file 4
Saved file 5
Traceback (most recent call last):
  File "/home/todd/disk-fault/test.py", line 7, in <module>
    print >>f, "hello world! this is line %d" % line
IOError: [Errno 30] Read-only file system
{code}
[ext4 automatically remounts itself readonly]
7) umount /mnt, clear the fault with -p clear, remount /mnt
8) results of ls -l:
{code}
root@todd-desktop:/mnt# ls -l
total 16
-rw-r--r-- 1 root root     0 2010-05-13 23:11 file_1
-rw-r--r-- 1 root root     0 2010-05-13 23:11 file_2
-rw-r--r-- 1 root root     0 2010-05-13 23:11 file_3_ckpt
drwx------ 2 root root 16384 2010-05-13 22:45 lost+found
{code}
I then modified the test script to add f.flush() and os.fsync(f.fileno()) right before the close(), and ran the exact same test. Results:
{code}
root@todd-desktop:/mnt# ~/disk-fault/test.py
Saved file 1
Saved file 2
Traceback (most recent call last):
  File "/home/todd/disk-fault/test.py", line 9, in <module>
    os.fsync(f.fileno())
[umount, clear fault, remount]
root@todd-desktop:/mnt# ls -l
total 66208
-rw-r--r-- 1 root root  3390 2010-05-13 23:20 file_1
-rw-r--r-- 1 root root  3390 2010-05-13 23:20 file_2_ckpt
drwx------ 2 root root 16384 2010-05-13 22:45 lost+found
{code}
I tried the same test on ext3, and without the fsync the files entirely disappeared. The same was true of xfs. Adding the fsync before close fixed the issue in all cases. FSImage writing should always fsync before close Key: HDFS-970 URL: https://issues.apache.org/jira/browse/HDFS-970 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.20.1, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Attachments: hdfs-970.txt Without an fsync, it's common that filesystems will delay the writing of metadata to the journal until all of the data blocks have been flushed. If the system crashes while the dirty pages haven't been flushed, the file is left in an indeterminate state. In some FSs (e.g. ext4) this will result in a 0-length file. In others (e.g. XFS) it will result in the correct length but any number of data blocks getting zeroed. Calling FileChannel.force before closing the FSImage prevents this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
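[Editorial note on HDFS-970] For reference, a minimal Java sketch of the pattern the description prescribes — flush, force, close, then rename. The file names mirror the Python test above and are illustrative; this is not the actual FSImage patch:
{code}
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Write a checkpoint file durably, then atomically rename it into place.
public class SyncedCheckpoint {
  public static void main(String[] args) throws IOException {
    FileOutputStream out = new FileOutputStream("file_1_ckpt");
    for (int line = 0; line < 100; line++) {
      out.write(("hello world! this is line " + line + "\n").getBytes());
    }
    out.flush();                  // drain user-space stream buffers
    out.getChannel().force(true); // fsync: push data and metadata to disk
    out.close();
    // Only now is the rename safe: the target can never be a 0-length file.
    new File("file_1_ckpt").renameTo(new File("file_1"));
  }
}
{code}
Without the force(true), the rename can be journaled before the data blocks reach disk, which is exactly the 0-length-file outcome the test above demonstrates.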
[jira] Commented: (HDFS-889) Possible race condition in BlocksMap.NodeIterator.
[ https://issues.apache.org/jira/browse/HDFS-889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867527#action_12867527 ] Steve Loughran commented on HDFS-889: - Is this just a test bug? I don't know. We've only seen it in tests, but that doesn't mean that it hasn't happened out in the field? Possible race condition in BlocksMap.NodeIterator. -- Key: HDFS-889 URL: https://issues.apache.org/jira/browse/HDFS-889 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.22.0 Reporter: Steve Loughran Hudson's test run for HDFS-165 is showing an NPE in {{org.apache.hadoop.hdfs.server.namenode.TestNodeCount.testNodeCount()}} One problem could be in {{BlocksMap.NodeIterator}}. Its {{hasNext()}} method checks that the next entry isn't null. But what if, between the {{hasNext()}} call and the {{next()}} operation, the map changes and an entry goes away? In that situation, the node returned from next() will be null. This is potentially serious, as a quick look through the code shows that the iterator gets retrieved a lot, and everywhere hadoop does so, it assumes the value is not null. It's also one of those problems that doesn't have a simple "make it go away" fix. Options
# Ignore it, hope it doesn't happen very often, and assume the test failure was a one-off that will never happen in a production datacentre. This is the default. The iterator is only used in the namenode, so while it does depend on the # of datanodes, it isn't running on 4000 machines in a big cluster.
# Leave the iterator as is, and have all the in-Hadoop code check for a null value and break the loop.
# Patch the {{NodeIterator}} to be consistent with the {{Iterator}} specification and throw a {{NoSuchElementException}} if the next value is null (a sketch of this option follows below). This does not make the problem go away, but now it is handled by having every use in-Hadoop catch the exception at the right point and exit the loop.
Testing. This should be possible.
# Create a block map
# Iterate over a block
# While the iteration is in progress, remove the next block in the list. Expect the next call to next() to fail in whatever way you choose.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
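[Editorial note on HDFS-889] To make option 3 concrete, a self-contained illustrative sketch of an iterator that honors the {{java.util.Iterator}} contract; the backing array is a stand-in for the real BlocksMap internals, which this does not reproduce:
{code}
import java.util.Iterator;
import java.util.NoSuchElementException;

// Option 3 sketch: re-check in next() and throw NoSuchElementException
// if the entry vanished between hasNext() and next().
class NodeIterator implements Iterator<String> {
  private final String[] nodes; // entries may be nulled by a concurrent remove
  private int pos = 0;

  NodeIterator(String[] nodes) { this.nodes = nodes; }

  public boolean hasNext() {
    return pos < nodes.length && nodes[pos] != null;
  }

  public String next() {
    // Do not trust an earlier hasNext(): the map may have changed in
    // between, which is exactly the race described above.
    if (!hasNext()) {
      throw new NoSuchElementException("node removed during iteration");
    }
    return nodes[pos++];
  }

  public void remove() { throw new UnsupportedOperationException(); }
}
{code}
Callers would then catch {{NoSuchElementException}} at the loop and exit cleanly, rather than dereferencing a null node.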
[jira] Commented: (HDFS-1154) libhdfs::hdfsExists should return different value in case of 1. error and 2. file does not exist
[ https://issues.apache.org/jira/browse/HDFS-1154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867559#action_12867559 ] Eli Collins commented on HDFS-1154: --- Patch looks good. Would be good to add a test to the libhdfs tests and run {{ant -Dcompile.c++=true -Dlibhdfs=true test}}. libhdfs::hdfsExists should return different value in case of 1. error and 2. file does not exist Key: HDFS-1154 URL: https://issues.apache.org/jira/browse/HDFS-1154 Project: Hadoop HDFS Issue Type: Improvement Reporter: Zheng Shao Assignee: Zheng Shao Attachments: HDFS-1154.1.patch Now the code always returns -1 for those cases. There is no way for the application to know which is which. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
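[Editorial note on HDFS-1154] For context, the Java API that libhdfs wraps already separates the two cases — {{FileSystem.exists()}} returns false when the path is absent and throws IOException on error — and that is the distinction hdfsExists needs to surface to C callers. A minimal Java sketch of the two outcomes (the class name and path here are made up for illustration, and the C return-code convention is whatever the patch chooses):
{code}
import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Illustrates the two cases hdfsExists must keep apart: a clean
// "does not exist" answer versus a failure to get any answer at all.
public class ExistsCheck {
  public static void main(String[] args) throws IOException {
    FileSystem fs = FileSystem.get(new Configuration());
    Path p = new Path("/tmp/some-file"); // hypothetical path
    try {
      System.out.println(fs.exists(p) ? "exists" : "does not exist");
    } catch (IOException e) {
      // e.g. namenode unreachable: an error, not a missing file
      System.out.println("error talking to HDFS: " + e.getMessage());
    }
  }
}
{code}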
[jira] Commented: (HDFS-599) Improve Namenode robustness by prioritizing datanode heartbeats over client requests
[ https://issues.apache.org/jira/browse/HDFS-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867578#action_12867578 ] Hairong Kuang commented on HDFS-599: The reason all Protocols are available... I prefer that we have a clean separation of what requests each port can receive. Especially for client requests, they should not be allowed to be sent to the service port. This is because the HDFS admin has no control of which port a client uses to send its requests. If we ever allow the service port to receive a client's request, it will defeat the purpose of this jira. BTW, besides the balancer, I think HDFS admin requests should also be sent to the service port. But I am OK if you do it in a different jira. Please also open a jira for my comment 3 (providing different configuration for different RPC servers). Something we could consider down the road is to not start the client RPC server until the NN exits safemode, hence speeding up NN startup. Improve Namenode robustness by prioritizing datanode heartbeats over client requests Key: HDFS-599 URL: https://issues.apache.org/jira/browse/HDFS-599 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: Dmytro Molkov Attachments: HDFS-599.patch The namenode processes RPC requests from clients that are reading/writing to files as well as heartbeats/block reports from datanodes. Sometimes, because of various reasons (Java GC runs, inconsistent performance of the NFS filer that stores HDFS transaction logs, etc.), the namenode encounters transient slowness. For example, if the device that stores the HDFS transaction logs becomes sluggish, the Namenode's ability to process RPCs slows down to a certain extent. During this time, the RPCs from clients as well as the RPCs from datanodes suffer in similar fashion. If the underlying problem becomes worse, the NN's ability to process a heartbeat from a DN is severely impacted, thus causing the NN to declare that the DN is dead. Then the NN starts replicating blocks that used to reside on the now-declared-dead datanode. This adds extra load to the NN. Then the now-declared-dead datanode finally re-establishes contact with the NN, and sends a block report. The block report processing on the NN is another heavyweight activity, thus causing more load to the already overloaded namenode. My proposal is that the NN should try its best to continue processing RPCs from datanodes and give lesser priority to serving client requests. The Datanode RPCs are integral to the consistency and performance of the Hadoop file system, and it is better to protect them at all costs. This will ensure that the NN recovers from the hiccup much faster than it does now. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HDFS-1155) getDatanodeReport can be moved to NameNodeProtocol from ClientProtocol
getDatanodeReport can be moved to NameNodeProtocol from ClientProtocol -- Key: HDFS-1155 URL: https://issues.apache.org/jira/browse/HDFS-1155 Project: Hadoop HDFS Issue Type: Improvement Reporter: Dmytro Molkov Right now getDatanodeReport is used in only two places in the HDFS code: the Balancer and DFSAdmin, and it is the only reason for these classes to use DFSClient. If we moved the method definition (or copied it for now, deprecating the old location) to NameNodeProtocol, DFSAdmin and the Balancer would not rely on DFSClient anymore and would be cleaner. This would also help the Balancer run more cleanly against the HDFS-599 changes. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HDFS-1155) getDatanodeReport can be moved to NameNodeProtocol from ClientProtocol
[ https://issues.apache.org/jira/browse/HDFS-1155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmytro Molkov reassigned HDFS-1155: --- Assignee: Dmytro Molkov getDatanodeReport can be moved to NameNodeProtocol from ClientProtocol -- Key: HDFS-1155 URL: https://issues.apache.org/jira/browse/HDFS-1155 Project: Hadoop HDFS Issue Type: Improvement Reporter: Dmytro Molkov Assignee: Dmytro Molkov Right now getDatanodeReport is used in only two places in the HDFS code: the Balancer and DFSAdmin, and it is the only reason for these classes to use DFSClient. If we moved the method definition (or copied it for now, deprecating the old location) to NameNodeProtocol, DFSAdmin and the Balancer would not rely on DFSClient anymore and would be cleaner. This would also help the Balancer run more cleanly against the HDFS-599 changes. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HDFS-1156) Make Balancer run on the service port
Make Balancer run on the service port - Key: HDFS-1156 URL: https://issues.apache.org/jira/browse/HDFS-1156 Project: Hadoop HDFS Issue Type: Improvement Reporter: Dmytro Molkov The balancer should only run against the service port of HDFS once HDFS-599 makes it in. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Assigned: (HDFS-1156) Make Balancer run on the service port
[ https://issues.apache.org/jira/browse/HDFS-1156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmytro Molkov reassigned HDFS-1156: --- Assignee: Dmytro Molkov Make Balancer run on the service port - Key: HDFS-1156 URL: https://issues.apache.org/jira/browse/HDFS-1156 Project: Hadoop HDFS Issue Type: Improvement Reporter: Dmytro Molkov Assignee: Dmytro Molkov The balancer should only run against the service port of HDFS once HDFS-599 makes it in. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1153) The navigation to /dfsnodelist.jsp with invalid input parameters produces NPE and HTTP 500 error
[ https://issues.apache.org/jira/browse/HDFS-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867587#action_12867587 ] Eli Collins commented on HDFS-1153: --- The change looks good to me. The patch doesn't apply to the head of branch-20 so you may need to merge. The navigation to /dfsnodelist.jsp with invalid input parameters produces NPE and HTTP 500 error - Key: HDFS-1153 URL: https://issues.apache.org/jira/browse/HDFS-1153 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.1, 0.20.2 Reporter: Ravi Phulari Assignee: Ravi Phulari Fix For: 0.20.3 Attachments: HDFS-1153.patch Navigation to dfsnodelist.jsp with invalid input parameters produces NPE and HTTP 500 error. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1145) When NameNode is shutdown it tries to exit safemode
[ https://issues.apache.org/jira/browse/HDFS-1145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867588#action_12867588 ] dhruba borthakur commented on HDFS-1145: Can somebody please review this minor patch? When NameNode is shutdown it tries to exit safemode --- Key: HDFS-1145 URL: https://issues.apache.org/jira/browse/HDFS-1145 Project: Hadoop HDFS Issue Type: Bug Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: NNsafemodeMonitor.txt, NNsafemodeMonitor.txt Suppose the NameNode is in safemode. Then we try to shut it down by invoking NameNode.stop(). The stop() method interrupts all waiting threads, which in turn causes the SafeMode monitor to exit, thus triggering replication/deletion of blocks. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-599) Improve Namenode robustness by prioritizing datanode heartbeats over client requests
[ https://issues.apache.org/jira/browse/HDFS-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867600#action_12867600 ] Dmytro Molkov commented on HDFS-599: Hairong, thanks for your comments. Not starting the client RPC server until we are out of safemode is the second patch that we have been running internally for a while now, and I will port it to trunk as soon as this jira makes it in. I felt like adding both parts in one jira would be too big. DFSAdmin in that case does have to run on the service port. The clean separation makes sense, but I do not think we can fully make that separation available in the case of this JIRA. The way to effectively solve the two-port problem administratively is to firewall the service port from external clients and not include information about this port in the mapreduce configuration. This way only the HDFS cluster will have the information in its configuration, and only the datanodes will be accessing it. This is the way we were operating internally at FB. This of course doesn't help solve the problem of malicious clients still accessing the service port by hacking the values in the code (since the port should not be available in the configuration). However, removing the ClientProtocol from the service port would effectively make it impossible for the administrator to perform any client operations like LS, or even getting out of safemode (which is still in ClientProtocol), if we postpone the start of the client port until we are out of safemode. So essentially I feel like this problem can partly be solved by administrative measures, and the value that we get from keeping the ClientProtocol and others available on the service port still outweighs the problem of malicious clients that might get in on that port. Improve Namenode robustness by prioritizing datanode heartbeats over client requests Key: HDFS-599 URL: https://issues.apache.org/jira/browse/HDFS-599 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: Dmytro Molkov Attachments: HDFS-599.patch The namenode processes RPC requests from clients that are reading/writing to files as well as heartbeats/block reports from datanodes. Sometimes, because of various reasons (Java GC runs, inconsistent performance of the NFS filer that stores HDFS transaction logs, etc.), the namenode encounters transient slowness. For example, if the device that stores the HDFS transaction logs becomes sluggish, the Namenode's ability to process RPCs slows down to a certain extent. During this time, the RPCs from clients as well as the RPCs from datanodes suffer in similar fashion. If the underlying problem becomes worse, the NN's ability to process a heartbeat from a DN is severely impacted, thus causing the NN to declare that the DN is dead. Then the NN starts replicating blocks that used to reside on the now-declared-dead datanode. This adds extra load to the NN. Then the now-declared-dead datanode finally re-establishes contact with the NN, and sends a block report. The block report processing on the NN is another heavyweight activity, thus causing more load to the already overloaded namenode. My proposal is that the NN should try its best to continue processing RPCs from datanodes and give lesser priority to serving client requests. The Datanode RPCs are integral to the consistency and performance of the Hadoop file system, and it is better to protect them at all costs. This will ensure that the NN recovers from the hiccup much faster than it does now.
-- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1103) Replica recovery doesn't distinguish between flushed-but-corrupted last chunk and unflushed last chunk
[ https://issues.apache.org/jira/browse/HDFS-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867606#action_12867606 ] Hairong Kuang commented on HDFS-1103: - In the 0.21 append design, if every BlockReceiver could make sure that its buffered packet gets flushed to the disk before it exits on error, then I do not think the problem you described will happen. Probably the code does not enforce it now. I do not think that we should use the max of RBWs. In 0.21, there is no concept of a valid length for RBWs. Replica recovery doesn't distinguish between flushed-but-corrupted last chunk and unflushed last chunk -- Key: HDFS-1103 URL: https://issues.apache.org/jira/browse/HDFS-1103 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.21.0, 0.22.0 Reporter: Todd Lipcon Priority: Blocker Attachments: hdfs-1103-test.txt When the DN creates a replica under recovery, it calls validateIntegrity, which truncates the last checksum chunk off of a replica if it is found to be invalid. Then when the block recovery process happens, this shortened block wins over a longer replica from another node where there was no corruption. Thus, if just one of the DNs has an invalid last checksum chunk, data that has been sync()ed to other datanodes can be lost. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1103) Replica recovery doesn't distinguish between flushed-but-corrupted last chunk and unflushed last chunk
[ https://issues.apache.org/jira/browse/HDFS-1103?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867609#action_12867609 ] Hairong Kuang commented on HDFS-1103: - We should also exclude those RBWs that failed on disk errors from lease recovery if there are good ones available. Replica recovery doesn't distinguish between flushed-but-corrupted last chunk and unflushed last chunk -- Key: HDFS-1103 URL: https://issues.apache.org/jira/browse/HDFS-1103 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.21.0, 0.22.0 Reporter: Todd Lipcon Priority: Blocker Attachments: hdfs-1103-test.txt When the DN creates a replica under recovery, it calls validateIntegrity, which truncates the last checksum chunk off of a replica if it is found to be invalid. Then when the block recovery process happens, this shortened block wins over a longer replica from another node where there was no corruption. Thus, if just one of the DNs has an invalid last checksum chunk, data that has been sync()ed to other datanodes can be lost. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1153) The navigation to /dfsnodelist.jsp with invalid input parameters produces NPE and HTTP 500 error
[ https://issues.apache.org/jira/browse/HDFS-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867616#action_12867616 ] Ravi Phulari commented on HDFS-1153: Forgot to mention that this patch is for the Y20s branch. The navigation to /dfsnodelist.jsp with invalid input parameters produces NPE and HTTP 500 error - Key: HDFS-1153 URL: https://issues.apache.org/jira/browse/HDFS-1153 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.1, 0.20.2 Reporter: Ravi Phulari Assignee: Ravi Phulari Fix For: 0.20.3 Attachments: HDFS-1153.patch Navigation to dfsnodelist.jsp with invalid input parameters produces NPE and HTTP 500 error. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1153) The navigation to /dfsnodelist.jsp with invalid input parameters produces NPE and HTTP 500 error
[ https://issues.apache.org/jira/browse/HDFS-1153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-1153: -- Hadoop Flags: [Reviewed] +1 The navigation to /dfsnodelist.jsp with invalid input parameters produces NPE and HTTP 500 error - Key: HDFS-1153 URL: https://issues.apache.org/jira/browse/HDFS-1153 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.20.1, 0.20.2 Reporter: Ravi Phulari Assignee: Ravi Phulari Fix For: 0.20.3 Attachments: HDFS-1153.patch Navigation to dfsnodelist.jsp with invalid input parameters produces NPE and HTTP 500 error. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1150) Verify datanodes' identities to clients in secure clusters
[ https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-1150: -- Attachment: HDFS-1150-Y20S-ready-6.patch Patch is ready to go, I believe. #6 Verify datanodes' identities to clients in secure clusters -- Key: HDFS-1150 URL: https://issues.apache.org/jira/browse/HDFS-1150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node Affects Versions: 0.22.0 Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-1150-y20.build-script.patch, HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt Currently we use block access tokens to allow datanodes to verify clients' identities, however we don't have a way for clients to verify the authenticity of the datanodes themselves. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1021) specify correct server principal for RefreshAuthorizationPolicyProtocol and RefreshUserToGroupMappingsProtocol protocols in DFSAdmin (for HADOOP-6612)
[ https://issues.apache.org/jira/browse/HDFS-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867636#action_12867636 ] Boris Shkolnik commented on HDFS-1021: -- Ran tests manually. All passed. specify correct server principal for RefreshAuthorizationPolicyProtocol and RefreshUserToGroupMappingsProtocol protocols in DFSAdmin (for HADOOP-6612) -- Key: HDFS-1021 URL: https://issues.apache.org/jira/browse/HDFS-1021 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Boris Shkolnik Assignee: Boris Shkolnik Attachments: HDFS-1021.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1021) specify correct server principal for RefreshAuthorizationPolicyProtocol and RefreshUserToGroupMappingsProtocol protocols in DFSAdmin (for HADOOP-6612)
[ https://issues.apache.org/jira/browse/HDFS-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867640#action_12867640 ] Boris Shkolnik commented on HDFS-1021: -- Committed to trunk. specify correct server principal for RefreshAuthorizationPolicyProtocol and RefreshUserToGroupMappingsProtocol protocols in DFSAdmin (for HADOOP-6612) -- Key: HDFS-1021 URL: https://issues.apache.org/jira/browse/HDFS-1021 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Boris Shkolnik Assignee: Boris Shkolnik Attachments: HDFS-1021.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1143) Implement Background deletion
[ https://issues.apache.org/jira/browse/HDFS-1143?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867658#action_12867658 ] Scott Chen commented on HDFS-1143: -- Hey Koji, HDFS-173 helps other clients because it releases the lock from time to time while removing blocks. It is very nice. I think there is still room to improve.
1. collectSubtreeBlocksAndClear() is called inside the global lock. This is not necessary, because once we have called removeChild() the subtree is not referenced from outside. It is OK to do it without the lock. Avoiding holding this lock improves efficiency.
2. The client who performs the deletion does not have to wait for the blocks to be removed. Once the node is removed from the iNode tree and the file lease is cleared, the deletion should be considered finished. The rest can be moved to the background. This way the client who does the deletion will get a better response time.
I think what Dhruba says makes sense. To be more specific, we can
1. Do removeChild(), do removeLeaseWithPrefixPath(), and just launch the background cleanup task
2. In the background task,
a. Do collectSubtreeBlocksAndClear() without any lock
b. Hold the global lock and delete blocks in small batches to avoid holding the lock too long (a sketch of this batching follows below)
I think the bottom line is that we should just leave the atomic operations inside the lock and move everything else to the background. Implement Background deletion - Key: HDFS-1143 URL: https://issues.apache.org/jira/browse/HDFS-1143 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Affects Versions: 0.22.0 Reporter: Dmytro Molkov Assignee: Scott Chen Fix For: 0.22.0 Right now if you try to delete a massive number of files from the namenode it will freeze (sometimes for minutes). Most of the time is spent going through the blocks map and invalidating all the blocks. This can probably be improved by having a background GC process. The deletion will basically just remove the inode being deleted and then give the subtree that was just deleted to the background thread running cleanup. This way the namenode becomes available to clients soon after deletion, and all the heavy operations are done in the background. Thoughts? -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
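[Editorial note on HDFS-1143] A minimal sketch of that batching pattern. Block, Subtree, and invalidate() are placeholders for the real FSNamesystem/BlocksMap structures; only the locking shape is the point:
{code}
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class BackgroundDeleter {
  static class Block {}
  interface Subtree { List<Block> collectBlocks(); }

  private final Object globalLock = new Object();
  private final ExecutorService worker = Executors.newSingleThreadExecutor();
  private static final int BATCH = 1000; // blocks invalidated per lock hold

  // Called after removeChild()/removeLeaseWithPrefixPath(): the subtree is
  // already unreachable, so the client's delete() can return immediately.
  void deleteInBackground(final Subtree subtree) {
    worker.submit(new Runnable() {
      public void run() {
        // Step 2a: collect without the lock -- nothing else references
        // the detached subtree, so no synchronization is needed here.
        List<Block> blocks = subtree.collectBlocks();
        // Step 2b: invalidate in small batches so other clients can take
        // the lock between batches instead of stalling for minutes.
        for (int i = 0; i < blocks.size(); i += BATCH) {
          int end = Math.min(i + BATCH, blocks.size());
          synchronized (globalLock) {
            for (Block b : blocks.subList(i, end)) {
              invalidate(b);
            }
          }
        }
      }
    });
  }

  private void invalidate(Block b) {
    // remove from the blocks map and queue for deletion on the datanodes
  }
}
{code}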
[jira] Commented: (HDFS-292) TestDFSIO Should print stats for each file
[ https://issues.apache.org/jira/browse/HDFS-292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867670#action_12867670 ] Konstantin Shvachko commented on HDFS-292: -- If it writes/reads thousands of files it would be a pretty long printout. Should we just log per-file data or should we write it in the output file? TestDFSIO Should print stats for each file -- Key: HDFS-292 URL: https://issues.apache.org/jira/browse/HDFS-292 Project: Hadoop HDFS Issue Type: Improvement Reporter: Raghu Angadi TestDFSIO is a pretty useful micro benchmark for testing reading and writing in DFS. When we want to analyze its results later, it would be more useful if the test prints out stats for each file/map along with the aggregated stats. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1002) Secondary Name Node crash, NPE in edit log replay
[ https://issues.apache.org/jira/browse/HDFS-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867681#action_12867681 ] Tom White commented on HDFS-1002: - Todd, the patch no longer applies. Can you regenerate please? Secondary Name Node crash, NPE in edit log replay - Key: HDFS-1002 URL: https://issues.apache.org/jira/browse/HDFS-1002 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.21.0 Reporter: ryan rawson Priority: Blocker Fix For: 0.21.0 Attachments: addChildNPE.patch, snn_crash.tar.gz, snn_log.txt An NPE in SNN; the core of the message looks like so:
2010-02-25 11:54:05,834 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1152)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1164)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1067)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadEditRecords(FSEditLog.java:511)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:401)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:368)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1172)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:594)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:476)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:353)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:317)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:219)
at java.lang.Thread.run(Thread.java:619)
This happens even if I restart SNN over and over again. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1019) Incorrect default values for delegation tokens in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-1019: --- Status: Open (was: Patch Available) Incorrect default values for delegation tokens in hdfs-default.xml -- Key: HDFS-1019 URL: https://issues.apache.org/jira/browse/HDFS-1019 Project: Hadoop HDFS Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HDFS-1019-y20.1.patch, HDFS-1019.1.patch, HDFS-1019.2.patch, HDFS-1019.3.patch The default values for delegation token parameters in hdfs-default.xml are incorrect. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1019) Incorrect default values for delegation tokens in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jitendra Nath Pandey updated HDFS-1019: --- Status: Patch Available (was: Open) Incorrect default values for delegation tokens in hdfs-default.xml -- Key: HDFS-1019 URL: https://issues.apache.org/jira/browse/HDFS-1019 Project: Hadoop HDFS Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HDFS-1019-y20.1.patch, HDFS-1019.1.patch, HDFS-1019.2.patch, HDFS-1019.3.patch The default values for delegation token parameters in hdfs-default.xml are incorrect. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1002) Secondary Name Node crash, NPE in edit log replay
[ https://issues.apache.org/jira/browse/HDFS-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867687#action_12867687 ] Todd Lipcon commented on HDFS-1002: --- Hey Tom, I think Konstantin was referring to the patch on HDFS-909 which got committed. I think this jira can probably be resolved as invalid (ie, it is a result of other bugs that cause corruption, not a bug on its own) Secondary Name Node crash, NPE in edit log replay - Key: HDFS-1002 URL: https://issues.apache.org/jira/browse/HDFS-1002 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.21.0 Reporter: ryan rawson Priority: Blocker Fix For: 0.21.0 Attachments: addChildNPE.patch, snn_crash.tar.gz, snn_log.txt An NPE in SNN; the core of the message looks like so:
2010-02-25 11:54:05,834 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1152)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1164)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1067)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadEditRecords(FSEditLog.java:511)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:401)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:368)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1172)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:594)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:476)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:353)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:317)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:219)
at java.lang.Thread.run(Thread.java:619)
This happens even if I restart SNN over and over again. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HDFS-1002) Secondary Name Node crash, NPE in edit log replay
[ https://issues.apache.org/jira/browse/HDFS-1002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-1002. --- Resolution: Invalid Secondary Name Node crash, NPE in edit log replay - Key: HDFS-1002 URL: https://issues.apache.org/jira/browse/HDFS-1002 Project: Hadoop HDFS Issue Type: Bug Affects Versions: 0.21.0 Reporter: ryan rawson Priority: Blocker Fix For: 0.21.0 Attachments: addChildNPE.patch, snn_crash.tar.gz, snn_log.txt An NPE in SNN; the core of the message looks like so:
2010-02-25 11:54:05,834 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: java.lang.NullPointerException
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1152)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addChild(FSDirectory.java:1164)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.addNode(FSDirectory.java:1067)
at org.apache.hadoop.hdfs.server.namenode.FSDirectory.unprotectedAddFile(FSDirectory.java:213)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadEditRecords(FSEditLog.java:511)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:401)
at org.apache.hadoop.hdfs.server.namenode.FSEditLog.loadFSEdits(FSEditLog.java:368)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSEdits(FSImage.java:1172)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.doMerge(SecondaryNameNode.java:594)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.access$000(SecondaryNameNode.java:476)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doMerge(SecondaryNameNode.java:353)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:317)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:219)
at java.lang.Thread.run(Thread.java:619)
This happens even if I restart SNN over and over again. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1021) specify correct server principal for RefreshAuthorizationPolicyProtocol and RefreshUserToGroupMappingsProtocol protocols in DFSAdmin (for HADOOP-6612)
[ https://issues.apache.org/jira/browse/HDFS-1021?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boris Shkolnik updated HDFS-1021: - Status: Resolved (was: Patch Available) Resolution: Fixed specify correct server principal for RefreshAuthorizationPolicyProtocol and RefreshUserToGroupMappingsProtocol protocols in DFSAdmin (for HADOOP-6612) -- Key: HDFS-1021 URL: https://issues.apache.org/jira/browse/HDFS-1021 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Boris Shkolnik Assignee: Boris Shkolnik Attachments: HDFS-1021.patch -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1150) Verify datanodes' identities to clients in secure clusters
[ https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-1150: -- Attachment: HDFS-1150-Y20S-ready-7.patch Final patch based on peer review from Jitendra and Devaraj. Fixed a couple of javadoc issues and a troublesome jdk reference. Tests look OK; test-patch has two false findbugs warnings from pre-existing errors that were refactored into new methods. Verify datanodes' identities to clients in secure clusters -- Key: HDFS-1150 URL: https://issues.apache.org/jira/browse/HDFS-1150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node Affects Versions: 0.22.0 Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-1150-y20.build-script.patch, HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt Currently we use block access tokens to allow datanodes to verify clients' identities, however we don't have a way for clients to verify the authenticity of the datanodes themselves. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-599) Improve Namenode robustness by prioritizing datanode heartbeats over client requests
[ https://issues.apache.org/jira/browse/HDFS-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867731#action_12867731 ] dhruba borthakur commented on HDFS-599: --- In fact, the configuration deployed to clients does not need to contain the service port # at all. Then, no well-behaved client will access the service port. Improve Namenode robustness by prioritizing datanode heartbeats over client requests Key: HDFS-599 URL: https://issues.apache.org/jira/browse/HDFS-599 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: Dmytro Molkov Attachments: HDFS-599.patch The namenode processes RPC requests from clients that are reading/writing to files as well as heartbeats/block reports from datanodes. Sometimes, because of various reasons (Java GC runs, inconsistent performance of the NFS filer that stores HDFS transaction logs, etc.), the namenode encounters transient slowness. For example, if the device that stores the HDFS transaction logs becomes sluggish, the Namenode's ability to process RPCs slows down to a certain extent. During this time, the RPCs from clients as well as the RPCs from datanodes suffer in similar fashion. If the underlying problem becomes worse, the NN's ability to process a heartbeat from a DN is severely impacted, thus causing the NN to declare that the DN is dead. Then the NN starts replicating blocks that used to reside on the now-declared-dead datanode. This adds extra load to the NN. Then the now-declared-dead datanode finally re-establishes contact with the NN, and sends a block report. The block report processing on the NN is another heavyweight activity, thus causing more load to the already overloaded namenode. My proposal is that the NN should try its best to continue processing RPCs from datanodes and give lesser priority to serving client requests. The Datanode RPCs are integral to the consistency and performance of the Hadoop file system, and it is better to protect them at all costs. This will ensure that the NN recovers from the hiccup much faster than it does now. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
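[Editorial note on HDFS-599] For concreteness, a sketch of what such a split configuration could look like. Treat this as an assumption for illustration: the property name {{dfs.namenode.servicerpc-address}} follows what the HDFS-599 patch appears to introduce, and the host/port values are made up. Datanodes and admin hosts would carry both addresses, while configs shipped to clients carry only the client address:
{code}
<!-- hdfs-site.xml fragment on datanodes/admin hosts (illustrative sketch) -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://namenode:8020</value> <!-- client RPC address -->
</property>
<property>
  <name>dfs.namenode.servicerpc-address</name> <!-- assumed property name -->
  <value>namenode:8021</value> <!-- service address; omit from client-facing configs -->
</property>
{code}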
[jira] Updated: (HDFS-1150) Verify datanodes' identities to clients in secure clusters
[ https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Homan updated HDFS-1150: -- Attachment: HDFS-1150-Y20S-ready-8.patch Added license header to build.sh Verify datanodes' identities to clients in secure clusters -- Key: HDFS-1150 URL: https://issues.apache.org/jira/browse/HDFS-1150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node Affects Versions: 0.22.0 Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-1150-y20.build-script.patch, HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt Currently we use block access tokens to allow datanodes to verify clients' identities, however we don't have a way for clients to verify the authenticity of the datanodes themselves. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1150) Verify datanodes' identities to clients in secure clusters
[ https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867734#action_12867734 ] Allen Wittenauer commented on HDFS-1150: It appears the assumption is that the attacker won't be able to get root privileges. I don't think that's realistic for a complete fix, and it seems more of a security-through-obscurity approach. This is clearly a protocol problem and should be treated as such. Doing TCP port sleight of hand isn't a real fix. Verify datanodes' identities to clients in secure clusters -- Key: HDFS-1150 URL: https://issues.apache.org/jira/browse/HDFS-1150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node Affects Versions: 0.22.0 Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-1150-y20.build-script.patch, HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt Currently we use block access tokens to allow datanodes to verify clients' identities, however we don't have a way for clients to verify the authenticity of the datanodes themselves. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1094) Intelligent block placement policy to decrease probability of block loss
[ https://issues.apache.org/jira/browse/HDFS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravind Menon updated HDFS-1094: Attachment: prob.pdf Hi, We did some analysis of the data loss probability in HDFS under different block placement schemes and replication factors. We consider two simple placement schemes: 1. random placement, where each data block can be placed at random on any machine, and 2. in-rack placement, where all replicas of a block are placed in the same rack. The detailed analysis of these scenarios is covered in the attached pdf. The main observations from the analysis are:
1. Probability of data loss increases with cluster size
2. Probability of data loss is lower with in-rack placement than with random placement
3. Probability of data loss is lower with a higher degree of replication
Regards, Aravind Intelligent block placement policy to decrease probability of block loss Key: HDFS-1094 URL: https://issues.apache.org/jira/browse/HDFS-1094 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: prob.pdf The current HDFS implementation specifies that the first replica is local and the other two replicas are on any two random nodes on a random remote rack. This means that if any three datanodes die together, then there is a non-trivial probability of losing at least one block in the cluster. This JIRA is to discuss if there is a better algorithm that can lower the probability of losing a block. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
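[Editorial note on HDFS-1094] As a crude sanity check of observations 1 and 3 — an independent-failure back-of-envelope, not the model in the attached pdf, which handles simultaneous failures and rack structure: if each of the r replica holders of a block fails independently with probability p during some window, and the cluster holds B blocks, then
{code}
% LaTeX: p = per-node failure probability, r = replication factor, B = blocks
P(\text{a given block is lost}) = p^{r}
P(\text{at least one of } B \text{ blocks is lost}) = 1 - (1 - p^{r})^{B} \approx B\,p^{r} \quad (p^{r} \ll 1)
{code}
so the loss probability grows roughly linearly with B (and hence with cluster size at fixed utilization) and drops geometrically with r. Observation 2 requires the rack-aware analysis in the attachment.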
[jira] Commented: (HDFS-1150) Verify datanodes' identities to clients in secure clusters
[ https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867736#action_12867736 ] Jakob Homan commented on HDFS-1150: ---
bq. It appears the assumption is that the attacker won't be able to get root privileges.
This is indeed an assumption we've had for all the security work. Should one get root, they can get krb keytabs and at that point, game's over. This approach doesn't fix that assumption, but is consistent with it. Verify datanodes' identities to clients in secure clusters -- Key: HDFS-1150 URL: https://issues.apache.org/jira/browse/HDFS-1150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node Affects Versions: 0.22.0 Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-1150-y20.build-script.patch, HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt Currently we use block access tokens to allow datanodes to verify clients' identities, however we don't have a way for clients to verify the authenticity of the datanodes themselves. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1150) Verify datanodes' identities to clients in secure clusters
[ https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867738#action_12867738 ] Allen Wittenauer commented on HDFS-1150: Also,
- This patch will fail on Solaris and other operating systems that don't use GNU tar. It should use tar without the z option and pipe to gzip to be more portable.
- It adds a requirement for wget as part of the build system, which seems unnecessary. Shouldn't ivy or maven or one of these other fancy build tools be able to pull this code down?
Verify datanodes' identities to clients in secure clusters -- Key: HDFS-1150 URL: https://issues.apache.org/jira/browse/HDFS-1150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node Affects Versions: 0.22.0 Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-1150-y20.build-script.patch, HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt Currently we use block access tokens to allow datanodes to verify clients' identities, however we don't have a way for clients to verify the authenticity of the datanodes themselves. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel
[ https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dmytro Molkov updated HDFS-1071: Attachment: HDFS-1071.patch Please have a look. The patch is pretty simple. The test copies TestRestartDFS and adds a part where all image checksums are compared. savenamespace should write the fsimage to all configured fs.name.dir in parallel Key: HDFS-1071 URL: https://issues.apache.org/jira/browse/HDFS-1071 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: Dmytro Molkov Attachments: HDFS-1071.patch If you have a large number of files in HDFS, the fsimage file is very big. When the namenode restarts, it writes a copy of the fsimage to all directories configured in fs.name.dir. This takes a long time, especially if there are many directories in fs.name.dir. Make the NN write the fsimage to all these directories in parallel. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
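[Editorial note on HDFS-1071] For readers following along, a rough sketch of the parallel-save idea: one thread per configured fs.name.dir instead of a sequential loop. saveImageTo() is a placeholder for the real FSImage save routine, and error handling is deliberately simplified:
{code}
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

class ParallelSaver {
  void saveNamespace(List<File> nameDirs) throws InterruptedException {
    List<Thread> savers = new ArrayList<Thread>();
    for (final File dir : nameDirs) {
      Thread t = new Thread(new Runnable() {
        public void run() {
          try {
            saveImageTo(new File(dir, "current/fsimage"));
          } catch (IOException e) {
            // mark this storage directory as failed; keep the others going
          }
        }
      });
      t.start();
      savers.add(t);
    }
    for (Thread t : savers) {
      t.join(); // all directories finish before the save is declared done
    }
  }

  private void saveImageTo(File f) throws IOException {
    FileOutputStream out = new FileOutputStream(f); // placeholder write
    out.getChannel().force(true); // see HDFS-970: fsync before close
    out.close();
  }
}
{code}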
[jira] Updated: (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers
[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] sam rash updated HDFS-200: -- Attachment: checkLeases-fix-1.txt 1-line fix In HDFS, sync() not yet guarantees data available to the new readers Key: HDFS-200 URL: https://issues.apache.org/jira/browse/HDFS-200 Project: Hadoop HDFS Issue Type: New Feature Reporter: Tsz Wo (Nicholas), SZE Assignee: dhruba borthakur Priority: Blocker Attachments: 4379_20081010TC3.java, checkLeases-fix-1.txt, fsyncConcurrentReaders.txt, fsyncConcurrentReaders11_20.txt, fsyncConcurrentReaders12_20.txt, fsyncConcurrentReaders13_20.txt, fsyncConcurrentReaders14_20.txt, fsyncConcurrentReaders15_20.txt, fsyncConcurrentReaders16_20.txt, fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, fsyncConcurrentReaders5.txt, fsyncConcurrentReaders6.patch, fsyncConcurrentReaders9.patch, hadoop-stack-namenode-aa0-000-12.u.powerset.com.log.gz, hdfs-200-ryan-existing-file-fail.txt, hypertable-namenode.log.gz, namenode.log, namenode.log, Reader.java, Reader.java, reopen_test.sh, ReopenProblem.java, Writer.java, Writer.java In the append design doc (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it says * A reader is guaranteed to be able to read data that was 'flushed' before the reader opened the file. However, this feature is not yet implemented. Note that the operation 'flushed' is now called sync. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1142) Lease recovery doesn't reassign lease when triggered by append()
[ https://issues.apache.org/jira/browse/HDFS-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867747#action_12867747 ] Konstantin Shvachko commented on HDFS-1142: --- Sorry, took me a while. The idea with lease recovery after soft limit expiration was that it is done under the same lease holder. Here is why. Expiration of the soft limit means that somebody else can claim the lease, and if he succeeds, then he is the new owner, if not, then not. So here several clients may compete for the same lease. They will call {{create()}} and get {{RecoveryInProgressException}} in response, which indicates that they should retry. The old client, if still there, can also compete for the lease. It has an advantage over other clients, because it does not need to go through the recovery process, but that seems fair. If you reassign the lease to {{HDFS_NameNode}}, then its timeouts will reset, see {{reassignLease()}}. And this will change the behavior. The clients trying to claim the file will be getting {{AlreadyBeingCreatedException}}, which means they cannot compete for the file anymore, and should fail. Suppose there is only one new client, and the old owner had died already. The client tries {{create()}}. This triggers lease recovery on the NN, which starts the recovery under {{HDFS_NameNode}}, and throws {{RecoveryInProgressException}} back to the client. The client retries as expected, and the next time gets {{AlreadyBeingCreatedException}}. Thinking that somebody else got lucky before him, the client bails out, which is not right, as there is nobody else competing for the file. Does that make sense? I don't see a problem here. Do you have failing tests because of that? That, by the way, explains the parameter of {{internalReleaseLease()}}.
- Introduction of the {{NN_LEASE_RECOVERY_HOLDER}} constant definitely makes sense.
- Persisting leases is not an issue if we do not reassign.
- For future reference, it is very undesirable to declare public methods in {{FSNamesystem}} to provide access to them from tests. The tests should either be in the right package, or alternatively the {{FSNamesystem}} methods should be accessed via {{NameNodeAdapter}}; that's why it was introduced in the first place, see HDFS-563.
Lease recovery doesn't reassign lease when triggered by append() Key: HDFS-1142 URL: https://issues.apache.org/jira/browse/HDFS-1142 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Attachments: hdfs-1142.txt, hdfs-1142.txt If a soft lease has expired and another writer calls append(), it triggers lease recovery but doesn't reassign the lease to a new owner. Therefore, the old writer can continue to allocate new blocks, try to steal back the lease, etc. This is for the testRecoveryOnBlockBoundary case of HDFS-1139 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers
[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12867748#action_12867748 ] Todd Lipcon commented on HDFS-200: -- Hey Sam, any unit test for that last fix? Or description of what was wrong with it? In HDFS, sync() not yet guarantees data available to the new readers Key: HDFS-200 URL: https://issues.apache.org/jira/browse/HDFS-200 Project: Hadoop HDFS Issue Type: New Feature Reporter: Tsz Wo (Nicholas), SZE Assignee: dhruba borthakur Priority: Blocker Attachments: 4379_20081010TC3.java, checkLeases-fix-1.txt, fsyncConcurrentReaders.txt, fsyncConcurrentReaders11_20.txt, fsyncConcurrentReaders12_20.txt, fsyncConcurrentReaders13_20.txt, fsyncConcurrentReaders14_20.txt, fsyncConcurrentReaders15_20.txt, fsyncConcurrentReaders16_20.txt, fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, fsyncConcurrentReaders5.txt, fsyncConcurrentReaders6.patch, fsyncConcurrentReaders9.patch, hadoop-stack-namenode-aa0-000-12.u.powerset.com.log.gz, hdfs-200-ryan-existing-file-fail.txt, hypertable-namenode.log.gz, namenode.log, namenode.log, Reader.java, Reader.java, reopen_test.sh, ReopenProblem.java, Writer.java, Writer.java In the append design doc (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it says * A reader is guaranteed to be able to read data that was 'flushed' before the reader opened the file. However, this feature is not yet implemented. Note that the operation 'flushed' is now called sync. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1142) Lease recovery doesn't reassign lease when triggered by append()
[ https://issues.apache.org/jira/browse/HDFS-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867750#action_12867750 ] Todd Lipcon commented on HDFS-1142: --- Hi Konstantin. Thanks for the detailed response. bq. Suppose there is only one new client, and the old owner has already died. The client tries create(). This triggers lease recovery on the NN, which starts the recovery under HDFS_NameNode and throws RecoveryInProgressException back to the client. The client retries as expected, and the next time gets AlreadyBeingCreatedException. Thinking that somebody else got lucky before him, the client bails out, which is not right, as there is nobody else competing for the file. What if we specifically compare the holder to the HDFS_NameNode special value, and in this case throw RecoveryInProgressException instead of AlreadyBeingCreatedException? bq. Does that make sense? I don't see a problem here. Do you have failing tests because of that? Yes - please see the new test case included in the patch above. The issue is that the client can continue to do things like completeFile or allocate new blocks while recovery is underway. bq. For future reference, it is very undesirable to declare public methods in FSNamesystem to provide access to them from tests I agree. However, in order to do mockito spying on commitBlockSynchronization, using a trampoline class like NameNodeAdapter would not work. If you agree with my above points, I can see if I can move the spy call into NameNodeAdapter itself. BTW, isn't this the point of the Private InterfaceAudience annotation? Let me know if you agree with the above idea (throwing RecoveryInProgressException when the lease is held by HDFS_NameNode). Lease recovery doesn't reassign lease when triggered by append() Key: HDFS-1142 URL: https://issues.apache.org/jira/browse/HDFS-1142 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Attachments: hdfs-1142.txt, hdfs-1142.txt If a soft lease has expired and another writer calls append(), it triggers lease recovery but doesn't reassign the lease to a new owner. Therefore, the old writer can continue to allocate new blocks, try to steal back the lease, etc. This is for the testRecoveryOnBlockBoundary case of HDFS-1139 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
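[Editor's sketch] A minimal sketch of the check proposed above, using the NN_LEASE_RECOVERY_HOLDER constant mentioned earlier in the thread. The method, its parameters, and the holder string are assumptions for illustration, not the actual patch; it would be invoked only when an unexpired lease on the file already exists.
{code}
import java.io.IOException;
import org.apache.hadoop.hdfs.protocol.AlreadyBeingCreatedException;
import org.apache.hadoop.hdfs.protocol.RecoveryInProgressException;

class LeaseHolderCheck {
  static final String NN_LEASE_RECOVERY_HOLDER = "HDFS_NameNode";

  // 'holder' is the client asking to create/append; 'leaseHolder' is
  // whoever currently owns the lease on 'src'. Called only when a
  // conflicting lease was found.
  static void check(String src, String holder, String leaseHolder)
      throws IOException {
    if (NN_LEASE_RECOVERY_HOLDER.equals(leaseHolder)) {
      // The NN itself is recovering the file: tell the caller to retry,
      // not to conclude that another client won the race.
      throw new RecoveryInProgressException(
          "Lease recovery of " + src + " is in progress; try again later");
    }
    throw new AlreadyBeingCreatedException("failed to create " + src
        + " for " + holder + ": already being created by " + leaseHolder);
  }
}
{code}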
[jira] Commented: (HDFS-1150) Verify datanodes' identities to clients in secure clusters
[ https://issues.apache.org/jira/browse/HDFS-1150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867745#action_12867745 ] Allen Wittenauer commented on HDFS-1150: More problems: - the hadoop shell script doesn't check whether jsvc is present and executable - the hadoop shell script is hard-coded to launch the datanode process as user 'hdfs' - the hadoop shell script is hard-coded to use /dev/stderr and /dev/stdout, which seems like a bad idea if something prevents the jsvc code from working properly I'm really confused as to how this is actually supposed to work in real-world usage: - It appears the intention is that the hadoop command is going to be run by root, based on checking $EUID. This means a lot more is getting executed as root than just the java process. Why aren't we just re-using the mapred setuid code to launch the datanode process rather than having this dependency? - This doesn't look like it will work with start-mapred.sh/start-all.sh unless those are also run as root. Verify datanodes' identities to clients in secure clusters -- Key: HDFS-1150 URL: https://issues.apache.org/jira/browse/HDFS-1150 Project: Hadoop HDFS Issue Type: New Feature Components: data-node Affects Versions: 0.22.0 Reporter: Jakob Homan Assignee: Jakob Homan Attachments: HDFS-1150-y20.build-script.patch, HDFS-1150-Y20S-ready-5.patch, HDFS-1150-Y20S-ready-6.patch, HDFS-1150-Y20S-ready-7.patch, HDFS-1150-Y20S-ready-8.patch, HDFS-1150-Y20S-Rough-2.patch, HDFS-1150-Y20S-Rough-3.patch, HDFS-1150-Y20S-Rough-4.patch, HDFS-1150-Y20S-Rough.txt Currently we use block access tokens to allow datanodes to verify clients' identities; however, we don't have a way for clients to verify the authenticity of the datanodes themselves. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-599) Improve Namenode robustness by prioritizing datanode heartbeats over client requests
[ https://issues.apache.org/jira/browse/HDFS-599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867754#action_12867754 ] Hairong Kuang commented on HDFS-599: This of course doesn't help solve the problem of malicious clients still accessing the service port by hacking the values in the code. I am not talking about a malicious client. What if a mis-configured client happens to choose the service port as its client port? Removing the ClientProtocol from the service port will effectively make it impossible for an administrator to perform any client operations, like ls, or even getting out of safemode. You should break the current ClientProtocol into an AdminProtocol and the real ClientProtocol. Improve Namenode robustness by prioritizing datanode heartbeats over client requests Key: HDFS-599 URL: https://issues.apache.org/jira/browse/HDFS-599 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: Dmytro Molkov Attachments: HDFS-599.patch The namenode processes RPC requests from clients that are reading/writing to files as well as heartbeats/block reports from datanodes. Sometimes, for various reasons (Java GC runs, inconsistent performance of the NFS filer that stores the HDFS transaction logs, etc.), the namenode encounters transient slowness. For example, if the device that stores the HDFS transaction logs becomes sluggish, the Namenode's ability to process RPCs slows down to a certain extent. During this time, the RPCs from clients as well as the RPCs from datanodes suffer in similar fashion. If the underlying problem becomes worse, the NN's ability to process a heartbeat from a DN is severely impacted, thus causing the NN to declare that the DN is dead. Then the NN starts replicating blocks that used to reside on the now-declared-dead datanode. This adds extra load to the NN. Then the now-declared-dead datanode finally re-establishes contact with the NN, and sends a block report. The block report processing on the NN is another heavyweight activity, thus causing more load to the already overloaded namenode. My proposal is that the NN should try its best to continue processing RPCs from datanodes and give lesser priority to serving client requests. The Datanode RPCs are integral to the consistency and performance of the Hadoop file system, and it is better to protect them at all costs. This will ensure that the NN recovers from the hiccup much faster than it does now. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
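[Editor's sketch] A hedged sketch of the split suggested above: administrative calls move to a separate protocol exported only on the service port, while the real ClientProtocol stays on the client port. The interface name and the exact method set are illustrative, not from any patch; the individual operations shown do exist on ClientProtocol today.
{code}
import java.io.IOException;
import org.apache.hadoop.hdfs.protocol.FSConstants;
import org.apache.hadoop.ipc.VersionedProtocol;

// Exported only on the service port alongside DatanodeProtocol, so a
// mis-configured client connecting to that port cannot issue ordinary
// client RPCs, yet the administrator keeps full control.
public interface AdminProtocol extends VersionedProtocol {
  boolean setSafeMode(FSConstants.SafeModeAction action) throws IOException;
  void refreshNodes() throws IOException;
  void saveNamespace() throws IOException;
}
{code}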
[jira] Commented: (HDFS-1019) Incorrect default values for delegation tokens in hdfs-default.xml
[ https://issues.apache.org/jira/browse/HDFS-1019?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867755#action_12867755 ] Hadoop QA commented on HDFS-1019: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12443334/HDFS-1019.3.patch against trunk revision 944401. +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. -1 release audit. The applied patch generated 113 release audit warnings (more than the trunk's current 112 warnings). -1 core tests. The patch failed core unit tests. -1 contrib tests. The patch failed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/363/testReport/ Release audit warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/363/artifact/trunk/patchprocess/releaseAuditDiffWarnings.txt Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/363/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/363/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/363/console This message is automatically generated. Incorrect default values for delegation tokens in hdfs-default.xml -- Key: HDFS-1019 URL: https://issues.apache.org/jira/browse/HDFS-1019 Project: Hadoop HDFS Issue Type: Bug Reporter: Jitendra Nath Pandey Assignee: Jitendra Nath Pandey Attachments: HDFS-1019-y20.1.patch, HDFS-1019.1.patch, HDFS-1019.2.patch, HDFS-1019.3.patch The default values for delegation token parameters in hdfs-default.xml are incorrect. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers
[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867762#action_12867762 ] sam rash commented on HDFS-200: --- Sorry, should be easy to make a unit test, but I didn't have time. The case is that any lease that expires and has more than one path associated with it will throw exceptions. The LeaseManager calls internalRelease(lease, src) and expects only the lease for src to be removed by the call (hence it does its own loop over all the paths for a lease). However, with HDFS-200, this function removes *all* paths for the lease, not just the specified one. internalReleaseLeaseOne releases just the single path. If I get a minute, I can make such a test case quickly and update the patch with it. In HDFS, sync() not yet guarantees data available to the new readers Key: HDFS-200 URL: https://issues.apache.org/jira/browse/HDFS-200 Project: Hadoop HDFS Issue Type: New Feature Reporter: Tsz Wo (Nicholas), SZE Assignee: dhruba borthakur Priority: Blocker Attachments: 4379_20081010TC3.java, checkLeases-fix-1.txt, fsyncConcurrentReaders.txt, fsyncConcurrentReaders11_20.txt, fsyncConcurrentReaders12_20.txt, fsyncConcurrentReaders13_20.txt, fsyncConcurrentReaders14_20.txt, fsyncConcurrentReaders15_20.txt, fsyncConcurrentReaders16_20.txt, fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, fsyncConcurrentReaders5.txt, fsyncConcurrentReaders6.patch, fsyncConcurrentReaders9.patch, hadoop-stack-namenode-aa0-000-12.u.powerset.com.log.gz, hdfs-200-ryan-existing-file-fail.txt, hypertable-namenode.log.gz, namenode.log, namenode.log, Reader.java, Reader.java, reopen_test.sh, ReopenProblem.java, Writer.java, Writer.java In the append design doc (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it says * A reader is guaranteed to be able to read data that was 'flushed' before the reader opened the file However, this feature is not yet implemented. Note that the operation 'flushed' is now called sync. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
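[Editor's sketch] A self-contained toy model of the bug described above. The real classes are LeaseManager and FSNamesystem; everything below is a miniature for illustration only. The caller loops over a lease's paths and expects each call to release exactly one path, so a release-all implementation blows up on the second iteration:
{code}
import java.util.ArrayList;
import java.util.List;

class LeaseReleaseBug {
  static List<String> paths = new ArrayList<>(List.of("/a", "/b"));

  // Buggy shape (cf. internalReleaseLease after HDFS-200): drops every
  // path of the holder instead of just 'src'.
  static void internalRelease(String src) {
    if (paths.isEmpty()) {
      throw new IllegalStateException("no lease found on " + src);
    }
    paths.clear();
  }

  public static void main(String[] args) {
    // LeaseManager-style loop: one release call per expired path.
    for (String src : new ArrayList<>(paths)) {
      internalRelease(src);  // 2nd iteration throws: /b is already gone
    }
  }
}
{code}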
[jira] Commented: (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers
[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867766#action_12867766 ] Todd Lipcon commented on HDFS-200: -- cool, I agree with your assessment. Unit test should be good whenever you have a chance, thanks! In HDFS, sync() not yet guarantees data available to the new readers Key: HDFS-200 URL: https://issues.apache.org/jira/browse/HDFS-200 Project: Hadoop HDFS Issue Type: New Feature Reporter: Tsz Wo (Nicholas), SZE Assignee: dhruba borthakur Priority: Blocker Attachments: 4379_20081010TC3.java, checkLeases-fix-1.txt, fsyncConcurrentReaders.txt, fsyncConcurrentReaders11_20.txt, fsyncConcurrentReaders12_20.txt, fsyncConcurrentReaders13_20.txt, fsyncConcurrentReaders14_20.txt, fsyncConcurrentReaders15_20.txt, fsyncConcurrentReaders16_20.txt, fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, fsyncConcurrentReaders5.txt, fsyncConcurrentReaders6.patch, fsyncConcurrentReaders9.patch, hadoop-stack-namenode-aa0-000-12.u.powerset.com.log.gz, hdfs-200-ryan-existing-file-fail.txt, hypertable-namenode.log.gz, namenode.log, namenode.log, Reader.java, Reader.java, reopen_test.sh, ReopenProblem.java, Writer.java, Writer.java In the append design doc (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it says * A reader is guaranteed to be able to read data that was 'flushed' before the reader opened the file However, this feature is not yet implemented. Note that the operation 'flushed' is now called sync. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1142) Lease recovery doesn't reassign lease when triggered by append()
[ https://issues.apache.org/jira/browse/HDFS-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867772#action_12867772 ] Konstantin Shvachko commented on HDFS-1142: --- Comparing the holder with HDFS_NameNode will work, but the current logic should work too, so why change it? I suspect this is more a problem of completeFile() and new block allocation than of lease recovery. But let me look deeper into the test cases. The Private InterfaceAudience annotation will not prevent people from using the methods. Lease recovery doesn't reassign lease when triggered by append() Key: HDFS-1142 URL: https://issues.apache.org/jira/browse/HDFS-1142 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Blocker Attachments: hdfs-1142.txt, hdfs-1142.txt If a soft lease has expired and another writer calls append(), it triggers lease recovery but doesn't reassign the lease to a new owner. Therefore, the old writer can continue to allocate new blocks, try to steal back the lease, etc. This is for the testRecoveryOnBlockBoundary case of HDFS-1139 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1071) savenamespace should write the fsimage to all configured fs.name.dir in parallel
[ https://issues.apache.org/jira/browse/HDFS-1071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867778#action_12867778 ] Todd Lipcon commented on HDFS-1071: --- A few small notes: - Please change the threads to be named, e.g. FSImageSaver for /path/to/dir - I'm not sure if the behavior under InterruptedException is right - don't we want to retry joining on that thread? Or perhaps interrupt those threads themselves? I'm worried about leaving a straggling thread saving the namespace during a ^C shutdown, for example. - The code to join on all the threads in the list is repeated a lot - maybe factor it into a static method? savenamespace should write the fsimage to all configured fs.name.dir in parallel Key: HDFS-1071 URL: https://issues.apache.org/jira/browse/HDFS-1071 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: Dmytro Molkov Attachments: HDFS-1071.patch If you have a large number of files in HDFS, the fsimage file is very big. When the namenode restarts, it writes a copy of the fsimage to all directories configured in fs.name.dir. This takes a long time, especially if there are many directories in fs.name.dir. Make the NN write the fsimage to all these directories in parallel. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
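[Editor's sketch] Both review suggestions above in one self-contained sketch, with hypothetical class and method names: saver threads named after their directory, and the repeated join loop factored into a single helper that keeps waiting across InterruptedException so no saver thread is left straggling.
{code}
import java.io.File;
import java.util.ArrayList;
import java.util.List;

class ParallelImageSave {
  static void saveAll(List<File> nameDirs) {
    List<Thread> savers = new ArrayList<>();
    for (File dir : nameDirs) {
      Thread t = new Thread(() -> saveFSImage(dir),
          "FSImageSaver for " + dir.getPath());   // named, as requested
      t.start();
      savers.add(t);
    }
    joinAll(savers);
  }

  // Factored-out join: retried on interrupt, then the interrupt status
  // is restored so the caller can still observe it.
  static void joinAll(List<Thread> threads) {
    boolean interrupted = false;
    for (Thread t : threads) {
      while (t.isAlive()) {
        try {
          t.join();
        } catch (InterruptedException ie) {
          interrupted = true;   // keep waiting; don't abandon the saver
        }
      }
    }
    if (interrupted) {
      Thread.currentThread().interrupt();
    }
  }

  static void saveFSImage(File dir) {
    // placeholder for the real fsimage write into 'dir'
  }
}
{code}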
[jira] Updated: (HDFS-1094) Intelligent block placement policy to decrease probability of block loss
[ https://issues.apache.org/jira/browse/HDFS-1094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Aravind Menon updated HDFS-1094: Attachment: prob.pdf Fixed a minor typo in the pdf. Aravind Intelligent block placement policy to decrease probability of block loss Key: HDFS-1094 URL: https://issues.apache.org/jira/browse/HDFS-1094 Project: Hadoop HDFS Issue Type: Improvement Components: name-node Reporter: dhruba borthakur Assignee: dhruba borthakur Attachments: prob.pdf, prob.pdf The current HDFS implementation specifies that the first replica is local and the other two replicas are on any two random nodes on a random remote rack. This means that if any three datanodes die together, then there is a non-trivial probability of losing at least one block in the cluster. This JIRA is to discuss if there is a better algorithm that can lower probability of losing a block. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-970) FSImage writing should always fsync before close
[ https://issues.apache.org/jira/browse/HDFS-970?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HDFS-970: -- Status: Resolved (was: Patch Available) Hadoop Flags: [Reviewed] Fix Version/s: 0.22.0 Resolution: Fixed I just committed this, thanks Todd. FSImage writing should always fsync before close Key: HDFS-970 URL: https://issues.apache.org/jira/browse/HDFS-970 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.20.1, 0.21.0, 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Priority: Critical Fix For: 0.22.0 Attachments: hdfs-970.txt Without an fsync, it's common that filesystems will delay the writing of metadata to the journal until all of the data blocks have been flushed. If the system crashes while the dirty pages haven't been flushed, the file is left in an indeterminate state. In some FSs (eg ext4) this will result in a 0-length file. In others (eg XFS) it will result in the correct length but any number of data blocks getting zeroed. Calling FileChannel.force before closing the FSImage prevents this issue. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
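[Editor's sketch] For reference, the committed pattern in miniature: flush the stream, force the channel to disk, then close, so a crash cannot leave a truncated or zero-length image. This sketches the idea only; the actual change lives in the FSImage writing path.
{code}
import java.io.DataOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

class SyncBeforeClose {
  static void writeDurably(File f, byte[] data) throws IOException {
    FileOutputStream fos = new FileOutputStream(f);
    DataOutputStream out = new DataOutputStream(fos);
    try {
      out.write(data);
      out.flush();
      fos.getChannel().force(true);  // the crucial FileChannel.force (fsync)
    } finally {
      out.close();
    }
  }
}
{code}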
[jira] Updated: (HDFS-1130) Pass Administrator acl to HTTPServer for common servlet access.
[ https://issues.apache.org/jira/browse/HDFS-1130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HDFS-1130: -- Attachment: hdfs-1130.3.patch Updated patch for Y20S Pass Administrator acl to HTTPServer for common servlet access. --- Key: HDFS-1130 URL: https://issues.apache.org/jira/browse/HDFS-1130 Project: Hadoop HDFS Issue Type: Bug Components: security Reporter: Amareshwari Sriramadasu Fix For: 0.22.0 Attachments: hdfs-1130.3.patch, hdfs-1130.patch Once HADOOP-6748 is done, HDFS should pass administrator acl when HTTPServer is constructed. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-528) Add ability for safemode to wait for a minimum number of live datanodes
[ https://issues.apache.org/jira/browse/HDFS-528?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HDFS-528: -- Status: Open (was: Patch Available) It appears that this functionality can be achieved by some code outside the namenode. 1. Start the NN with dfs.safemode.threshold.pct set to 1.5, i.e. the NN will never exit safemode by itself. 2. Write a script that periodically invokes bin/hadoop dfsadmin -report and counts the number of datanodes that have checked in with the NN. 3. The script can explicitly exit safemode whenever it desires. This approach allows different when-to-exit-safemode policies to be implemented outside the NN. If you agree, then we can make this JIRA expose a new API from the NN that returns the safeBlockCount and totalBlockCount. Add ability for safemode to wait for a minimum number of live datanodes --- Key: HDFS-528 URL: https://issues.apache.org/jira/browse/HDFS-528 Project: Hadoop HDFS Issue Type: New Feature Components: scripts Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-528-v2.txt, hdfs-528-v3.txt, hdfs-528.txt, hdfs-528.txt When starting up a fresh cluster programmatically, users often want to wait until DFS is writable before continuing in a script. dfsadmin -safemode wait doesn't quite work for this on a completely fresh cluster, since when there are 0 blocks on the system, 100% of them are accounted for before any DNs have reported. This JIRA is to add a command which waits until a certain number of DNs have reported as alive to the NN. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
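[Editor's sketch] The same policy can be sketched in Java against the public client API instead of parsing dfsadmin -report output; getDataNodeStats() is what dfsadmin -report itself relies on. The threshold, polling interval, and class name are illustrative assumptions, and the FSConstants names are as of the 0.20/0.21 era.
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.hdfs.DistributedFileSystem;
import org.apache.hadoop.hdfs.protocol.FSConstants;

public class WaitForDatanodes {
  public static void main(String[] args) throws Exception {
    int minLive = Integer.parseInt(args[0]);
    DistributedFileSystem dfs =
        (DistributedFileSystem) FileSystem.get(new Configuration());
    // Poll until enough datanodes have checked in, then leave safemode.
    while (dfs.getDataNodeStats().length < minLive) {
      Thread.sleep(5000);
    }
    dfs.setSafeMode(FSConstants.SafeModeAction.SAFEMODE_LEAVE);
  }
}
{code}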
[jira] Commented: (HDFS-528) Add ability for safemode to wait for a minimum number of live datanodes
[ https://issues.apache.org/jira/browse/HDFS-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867799#action_12867799 ] Todd Lipcon commented on HDFS-528: -- I agree, but I don't think we should require operators to implement all of these special tools. The feature here is a pretty common request, I think. It's especially helpful for new users whose DN is broken - much better to have them get a "SafemodeException: 0 datanodes reported" kind of message than the bizarre-looking "Couldn't find any datanode for block" kind of errors they get now (if we were to default this to 1). Add ability for safemode to wait for a minimum number of live datanodes --- Key: HDFS-528 URL: https://issues.apache.org/jira/browse/HDFS-528 Project: Hadoop HDFS Issue Type: New Feature Components: scripts Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-528-v2.txt, hdfs-528-v3.txt, hdfs-528.txt, hdfs-528.txt When starting up a fresh cluster programmatically, users often want to wait until DFS is writable before continuing in a script. dfsadmin -safemode wait doesn't quite work for this on a completely fresh cluster, since when there are 0 blocks on the system, 100% of them are accounted for before any DNs have reported. This JIRA is to add a command which waits until a certain number of DNs have reported as alive to the NN. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-417) Improvements to Hadoop Thrift bindings
[ https://issues.apache.org/jira/browse/HDFS-417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HDFS-417: -- Status: Open (was: Patch Available) I am canceling this patch because it is sadly out of date and we do not yet have consensus on what to change. Please resurrect this discussion if you so desire. Thanks. Improvements to Hadoop Thrift bindings -- Key: HDFS-417 URL: https://issues.apache.org/jira/browse/HDFS-417 Project: Hadoop HDFS Issue Type: Bug Components: contrib/thriftfs Environment: Tested under Linux x86-64 Reporter: Carlos Valiente Assignee: Todd Lipcon Priority: Minor Attachments: all.diff, BlockManager.java, build_xml.diff, DefaultBlockManager.java, DFSBlockManager.java, gen.diff, hadoop-4707-31c331.patch.gz, HADOOP-4707-55c046a.txt, hadoop-4707-6bc958.txt, hadoop-4707-867f26.txt.gz, HADOOP-4707.diff, HADOOP-4707.patch, HADOOP-4707.patch, hadoopfs_thrift.diff, hadoopthriftapi.jar, HadoopThriftServer.java, HadoopThriftServer_java.diff, hdfs.py, hdfs_py_venky.diff, libthrift.jar, libthrift.jar, libthrift.jar, libthrift.jar I have made the following changes to hadoopfs.thrift: # Added namespaces for Python, Perl and C++. # Renamed parameters and struct members to camelCase versions to keep them consistent (in particular FileStatus.{blockReplication,blockSize} vs FileStatus.{block_replication,blocksize}). # Renamed ThriftHadoopFileSystem to FileSystem. From the perspective of a Perl/Python/C++ user, 1) it is already clear that we're using Thrift, and 2) the fact that we're dealing with Hadoop is already explicit in the namespace. The usage of generated code is more compact and (in my opinion) clearer: {quote} *Perl*: use HadoopFS; my $client = HadoopFS::FileSystemClient->new(..); _instead of:_ my $client = HadoopFS::ThriftHadoopFileSystemClient->new(..); *Python*: from hadoopfs import FileSystem client = FileSystem.Client(..) _instead of_ from hadoopfs import ThriftHadoopFileSystem client = ThriftHadoopFileSystem.Client(..) (See also the attached diff [^scripts_hdfs_py.diff] for the new version of 'scripts/hdfs.py'). *C++*: hadoopfs::FileSystemClient client(..); _instead of_: hadoopfs::ThriftHadoopFileSystemClient client(..); {quote} # Renamed ThriftHandle to FileHandle: As in 3, it is clear that we're dealing with a Thrift object, and its purpose (to act as a handle for file operations) is clearer. # Renamed ThriftIOException to IOException, to keep it simpler, and consistent with MalformedInputException. # Added explicit version tags to fields of ThriftHandle/FileHandle, Pathname, MalformedInputException and ThriftIOException/IOException, to improve compatibility of existing clients with future versions of the interface which might add new fields to those objects (like stack traces for the exception types, for instance). Those changes are reflected in the attachment [^hadoopfs_thrift.diff]. Changes in generated Java, Python, Perl and C++ code are also attached in [^gen.diff]. They were generated by a Thrift checkout from trunk ([http://svn.apache.org/repos/asf/incubator/thrift/trunk/]) as of revision 719697, plus the following Perl-related patches: * [https://issues.apache.org/jira/browse/THRIFT-190] * [https://issues.apache.org/jira/browse/THRIFT-193] * [https://issues.apache.org/jira/browse/THRIFT-199] The Thrift jar file [^libthrift.jar] built from that Thrift checkout is also attached, since it's needed to run the Java Thrift server. 
I have also added a new target to src/contrib/thriftfs/build.xml to build the Java bindings needed for org.apache.hadoop.thriftfs.HadoopThriftServer.java (see attachment [^build_xml.diff]) and modified HadoopThriftServer.java to make use of the new bindings (see attachment [^HadoopThriftServer_java.diff]). The jar file [^lib/hadoopthriftapi.jar] is also included, although it can be regenerated from the stuff under 'gen-java' and the new 'compile-gen' Ant target. The whole changeset is also included as [^all.diff]. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HDFS-417) Improvements to Hadoop Thrift bindings
[ https://issues.apache.org/jira/browse/HDFS-417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-417. -- Resolution: Won't Fix Yes, I agree - it seems the demand in the community isn't high for this at the moment, and Avro is coming down the pipe anyway. Resolving wontfix. Improvements to Hadoop Thrift bindings -- Key: HDFS-417 URL: https://issues.apache.org/jira/browse/HDFS-417 Project: Hadoop HDFS Issue Type: Bug Components: contrib/thriftfs Environment: Tested under Linux x86-64 Reporter: Carlos Valiente Assignee: Todd Lipcon Priority: Minor Attachments: all.diff, BlockManager.java, build_xml.diff, DefaultBlockManager.java, DFSBlockManager.java, gen.diff, hadoop-4707-31c331.patch.gz, HADOOP-4707-55c046a.txt, hadoop-4707-6bc958.txt, hadoop-4707-867f26.txt.gz, HADOOP-4707.diff, HADOOP-4707.patch, HADOOP-4707.patch, hadoopfs_thrift.diff, hadoopthriftapi.jar, HadoopThriftServer.java, HadoopThriftServer_java.diff, hdfs.py, hdfs_py_venky.diff, libthrift.jar, libthrift.jar, libthrift.jar, libthrift.jar I have made the following changes to hadoopfs.thrift: # Added namespaces for Python, Perl and C++. # Renamed parameters and struct members to camelCase versions to keep them consistent (in particular FileStatus.{blockReplication,blockSize} vs FileStatus.{block_replication,blocksize}). # Renamed ThriftHadoopFileSystem to FileSystem. From the perspective of a Perl/Python/C++ user, 1) it is already clear that we're using Thrift, and 2) the fact that we're dealing with Hadoop is already explicit in the namespace. The usage of generated code is more compact and (in my opinion) clearer: {quote} *Perl*: use HadoopFS; my $client = HadoopFS::FileSystemClient->new(..); _instead of:_ my $client = HadoopFS::ThriftHadoopFileSystemClient->new(..); *Python*: from hadoopfs import FileSystem client = FileSystem.Client(..) _instead of_ from hadoopfs import ThriftHadoopFileSystem client = ThriftHadoopFileSystem.Client(..) (See also the attached diff [^scripts_hdfs_py.diff] for the new version of 'scripts/hdfs.py'). *C++*: hadoopfs::FileSystemClient client(..); _instead of_: hadoopfs::ThriftHadoopFileSystemClient client(..); {quote} # Renamed ThriftHandle to FileHandle: As in 3, it is clear that we're dealing with a Thrift object, and its purpose (to act as a handle for file operations) is clearer. # Renamed ThriftIOException to IOException, to keep it simpler, and consistent with MalformedInputException. # Added explicit version tags to fields of ThriftHandle/FileHandle, Pathname, MalformedInputException and ThriftIOException/IOException, to improve compatibility of existing clients with future versions of the interface which might add new fields to those objects (like stack traces for the exception types, for instance). Those changes are reflected in the attachment [^hadoopfs_thrift.diff]. Changes in generated Java, Python, Perl and C++ code are also attached in [^gen.diff]. They were generated by a Thrift checkout from trunk ([http://svn.apache.org/repos/asf/incubator/thrift/trunk/]) as of revision 719697, plus the following Perl-related patches: * [https://issues.apache.org/jira/browse/THRIFT-190] * [https://issues.apache.org/jira/browse/THRIFT-193] * [https://issues.apache.org/jira/browse/THRIFT-199] The Thrift jar file [^libthrift.jar] built from that Thrift checkout is also attached, since it's needed to run the Java Thrift server. 
I have also added a new target to src/contrib/thriftfs/build.xml to build the Java bindings needed for org.apache.hadoop.thriftfs.HadoopThriftServer.java (see attachment [^build_xml.diff]) and modified HadoopThriftServer.java to make use of the new bindings (see attachment [^HadoopThriftServer_java.diff]). The jar file [^lib/hadoopthriftapi.jar] is also included, although it can be regenerated from the stuff under 'gen-java' and the new 'compile-gen' Ant target. The whole changeset is also included as [^all.diff]. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Resolved: (HDFS-419) Add Thrift interface to JobTracker/TaskTracker
[ https://issues.apache.org/jira/browse/HDFS-419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Todd Lipcon resolved HDFS-419. -- Resolution: Won't Fix Add Thrift interface to JobTracker/TaskTracker -- Key: HDFS-419 URL: https://issues.apache.org/jira/browse/HDFS-419 Project: Hadoop HDFS Issue Type: New Feature Components: contrib/thriftfs Reporter: Todd Lipcon Assignee: Todd Lipcon We currently have Thrift interfaces for accessing the NN and the DFS, but no access to the Mapred system. I'm currently working on instrumenting the JT with a Thrift plugin. If anyone has any thoughts in this area, please comment on this JIRA. Open questions: * Is job submission a practical goal to accomplish via Thrift? My thought is that this might be a goal for a second JIRA after basic monitoring/reporting is working. * Does this belong in contrib/thriftfs? I propose renaming contrib/thriftfs to contrib/thrift, as it will no longer be FS-specific. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-528) Add ability for safemode to wait for a minimum number of live datanodes
[ https://issues.apache.org/jira/browse/HDFS-528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=12867802#action_12867802 ] dhruba borthakur commented on HDFS-528: --- If it is a question of reporting an error to the user, how about we change the error message: if (#nodes in the cluster == 0) "There are no datanodes in the entire cluster" else "Couldn't find any datanode for block" Add ability for safemode to wait for a minimum number of live datanodes --- Key: HDFS-528 URL: https://issues.apache.org/jira/browse/HDFS-528 Project: Hadoop HDFS Issue Type: New Feature Components: scripts Affects Versions: 0.22.0 Reporter: Todd Lipcon Assignee: Todd Lipcon Attachments: hdfs-528-v2.txt, hdfs-528-v3.txt, hdfs-528.txt, hdfs-528.txt When starting up a fresh cluster programmatically, users often want to wait until DFS is writable before continuing in a script. dfsadmin -safemode wait doesn't quite work for this on a completely fresh cluster, since when there are 0 blocks on the system, 100% of them are accounted for before any DNs have reported. This JIRA is to add a command which waits until a certain number of DNs have reported as alive to the NN. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
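[Editor's sketch] In code form, the suggestion amounts to choosing the message based on the live-datanode count at the point where target selection fails; the method and names here are illustrative, not from a patch.
{code}
class NoTargetMessage {
  static String forBlock(int liveDatanodes, String block) {
    return liveDatanodes == 0
        ? "There are no datanodes in the entire cluster"
        : "Couldn't find any datanode for block " + block;
  }
}
{code}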