[jira] Commented: (HDFS-611) Heartbeats times from Datanodes increase when there are plenty of blocks to delete
[ https://issues.apache.org/jira/browse/HDFS-611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754040#action_12754040 ] dhruba borthakur commented on HDFS-611: --- > The datanode might remove blocks immediately from its in-memory lists that > are used to generate block reports, but the background thread could remove the > actual block files; that will work too. Will submit a patch soon. > Heartbeats times from Datanodes increase when there are plenty of blocks to > delete > -- > > Key: HDFS-611 > URL: https://issues.apache.org/jira/browse/HDFS-611 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Reporter: dhruba borthakur >Assignee: dhruba borthakur > > I am seeing that when we delete a large directory that has plenty of blocks, > the heartbeat times from datanodes increase significantly from the normal > value of 3 seconds to as large as 50 seconds or so. The heartbeat thread in > the Datanode deletes a bunch of blocks sequentially; this causes the > heartbeat times to increase. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
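The fix dhruba describes above can be sketched roughly as follows. This is a hedged illustration only: AsyncBlockDeleter and its methods are invented names, not the actual DataNode code. The point is that the heartbeat path does a cheap in-memory removal (so the block leaves block reports immediately) while the slow disk unlink is handed to a background thread.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Illustrative sketch only -- not HDFS code. Heartbeats stay fast because
// invalidateBlock() never touches the disk.
public class AsyncBlockDeleter {
  private final Map<Long, String> blockIdToFile = new ConcurrentHashMap<>();
  private final ExecutorService deleter = Executors.newSingleThreadExecutor();

  public void addBlock(long blockId, String file) {
    blockIdToFile.put(blockId, file);
  }

  /** Called from the heartbeat thread: O(1) per block, no disk I/O. */
  public void invalidateBlock(long blockId) {
    String file = blockIdToFile.remove(blockId); // gone from block reports now
    if (file != null) {
      deleter.submit(() -> new java.io.File(file).delete()); // unlink later
    }
  }

  public boolean hasBlock(long blockId) {
    return blockIdToFile.containsKey(blockId);
  }

  public void shutdownAndWait() throws InterruptedException {
    deleter.shutdown();
    deleter.awaitTermination(10, TimeUnit.SECONDS);
  }
}
```

With this split, a delete of a directory with many blocks costs the heartbeat thread only map removals; the 50-second stalls move off the heartbeat path entirely.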
[jira] Commented: (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers
[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754043#action_12754043 ] dhruba borthakur commented on HDFS-200: --- @ryan, stack: thanks for the info. I will follow it up very soon. will keep you posted. > In HDFS, sync() not yet guarantees data available to the new readers > > > Key: HDFS-200 > URL: https://issues.apache.org/jira/browse/HDFS-200 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Tsz Wo (Nicholas), SZE >Assignee: dhruba borthakur >Priority: Blocker > Attachments: 4379_20081010TC3.java, fsyncConcurrentReaders.txt, > fsyncConcurrentReaders11_20.txt, fsyncConcurrentReaders12_20.txt, > fsyncConcurrentReaders13_20.txt, fsyncConcurrentReaders14_20.txt, > fsyncConcurrentReaders3.patch, fsyncConcurrentReaders4.patch, > fsyncConcurrentReaders5.txt, fsyncConcurrentReaders6.patch, > fsyncConcurrentReaders9.patch, > hadoop-stack-namenode-aa0-000-12.u.powerset.com.log.gz, > hdfs-200-ryan-existing-file-fail.txt, hypertable-namenode.log.gz, > namenode.log, namenode.log, Reader.java, Reader.java, reopen_test.sh, > ReopenProblem.java, Writer.java, Writer.java > > > In the append design doc > (https://issues.apache.org/jira/secure/attachment/12370562/Appends.doc), it > says > * A reader is guaranteed to be able to read data that was 'flushed' before > the reader opened the file > However, this feature is not yet implemented. Note that the operation > 'flushed' is now called "sync". -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers
[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754052#action_12754052 ] dhruba borthakur commented on HDFS-200: --- I think this is the scenario that you are facing: 1. The file is written for the first time and was not closed. The writer closed the file, but at this time only one of the three replicas has checked in with the namenode. 2. The new writer invoked append() to write more data into the file. The new writer found the one remaining replica of the block, stamped a new generation stamp for this block, made it ready to receive new data for this file, and lease recovery is successful. The stamping of the new generation stamp essentially invalidated the other two replicas of this block; this block now has only one valid replica. The namenode won't start replicating this block till the block is full. If this sole datanode now goes down, then the file will be "missing a block". This is what you folks encountered. One option is to set dfs.replication.min to 2. This will ensure that closing a file (step 1) will be successful only when at least two replicas of the block have checked in with the namenode. This should reduce the probability of this problem occurring. Another option is to set the replication factor of the hbase log file(s) to be greater than 3. 
> In HDFS, sync() not yet guarantees data available to the new readers
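The two mitigations suggested above can be expressed as configuration. The snippet below is a hedged sketch: the property name is the 0.20-era key, so verify it against your Hadoop version before relying on it.

```xml
<!-- hdfs-site.xml: close() succeeds only after at least 2 replicas
     of the last block have checked in with the namenode. -->
<property>
  <name>dfs.replication.min</name>
  <value>2</value>
</property>
```

The per-file mitigation (a replication factor above 3 for the HBase log files) can be applied with the standard FileSystem API, e.g. `fs.setReplication(logPath, (short) 4)`.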
[jira] Issue Comment Edited: (HDFS-200) In HDFS, sync() not yet guarantees data available to the new readers
[ https://issues.apache.org/jira/browse/HDFS-200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754052#action_12754052 ] dhruba borthakur edited comment on HDFS-200 at 9/11/09 1:54 AM: I think this is the scenario that you are facing: 1. The file is written for the first time. The writer closed the file, but at this time only one of the three replicas has checked in with the namenode. 2. The new writer invoked append() to write more data into the file. The new writer found the one remaining replica of the block, stamped a new generation stamp on this replica, made it ready to receive new data for this file, and the file is now open for "append". The stamping of the new generation stamp essentially invalidated the other two replicas of this block; this block now has only one valid replica. The namenode won't start replicating this block till the block is full. If this sole datanode now dies, then the file will be "missing a block". This is what you folks encountered. One option is to set dfs.replication.min to 2. This will ensure that closing a file (step 1) will be successful only when at least two replicas of the block have checked in with the namenode. This should reduce the probability of this problem occurring. Along with that you could set the replication factor of the hbase log file(s) to be greater than 3. was (Author: dhruba): I think this is the scenario that you are facing: 1. The file is written for the first time and was not closed. The writer closed the file, but at this time only one of the three replicas has checked in with the namenode. 2. The new writer invoked append() to write more data into the file. The new writer found the one remaining replica of the block, stamped a new generation stamp for this block, made it ready to receive new data for this file, and lease recovery is successful. 
The stamping of the new generation stamp essentially invalidated the other two replicas of this block; this block now has only one valid replica. The namenode won't start replicating this block till the block is full. If this sole datanode now goes down, then the file will be "missing a block". This is what you folks encountered. One option is to set dfs.replication.min to 2. This will ensure that closing a file (step 1) will be successful only when at least two replicas of the block have checked in with the namenode. This should reduce the probability of this problem occurring. Another option is to set the replication factor of the hbase log file(s) to be greater than 3. > In HDFS, sync() not yet guarantees data available to the new readers
[jira] Commented: (HDFS-550) DataNode restarts may introduce corrupt/duplicated/lost replicas when handling detached replicas
[ https://issues.apache.org/jira/browse/HDFS-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754214#action_12754214 ] Hairong Kuang commented on HDFS-550: Dhruba, does my last comment make sense? I see we have two options for handling temporarily detached blocks at DataNode startup time: 1. Do not recover them. I would propose to put the temporarily detached blocks in the "tmp" directory so they get automatically deleted when the DataNode starts. 2. Recover them. Then the temporary .detach files have to be in a snapshot, and we need a way to figure out in which directory the original files are located. I think the proposal I put in this jira should work. Let me know which option you prefer. > DataNode restarts may introduce corrupt/duplicated/lost replicas when > handling detached replicas > > > Key: HDFS-550 > URL: https://issues.apache.org/jira/browse/HDFS-550 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: data-node >Affects Versions: 0.21.0 >Reporter: Hairong Kuang >Assignee: Hairong Kuang >Priority: Blocker > Fix For: Append Branch > > Attachments: detach.patch > > > Current trunk first calls detach to unlink a finalized replica before > appending to this block. Unlink is done by temporarily copying the block file > in the "current" subtree to a directory called "detach" under the volume's > data directory and then copying it back when unlink succeeds. On datanode > restarts, datanodes recover failed unlinks by copying replicas under "detach" > to "current". > There are two bugs with this implementation: > 1. The "detach" directory is not included in a snapshot, so rollback will > cause the "detaching" replicas to be lost. > 2. After a replica is copied to the "detach" directory, the information about > its original location is lost. The current implementation erroneously assumes > that the replica to be unlinked is under "current". This will make two > instances of replicas with the same block id coexist in a datanode. 
> Also, if a replica under "detach" is corrupt, the corrupt replica is moved to > "current" without being detected, polluting datanode data.
[jira] Updated: (HDFS-602) Attempt to make a directory under an existing file on DistributedFileSystem should throw a FileAlreadyExistsException instead of FileNotFoundException
[ https://issues.apache.org/jira/browse/HDFS-602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-602: - Resolution: Fixed Fix Version/s: 0.21.0 Release Note: DistributedFileSystem mkdirs throws FileAlreadyExistsException instead of FileNotFoundException. Hadoop Flags: [Incompatible change, Reviewed] (was: [Incompatible change]) Status: Resolved (was: Patch Available) I just committed this. Thank you, Boris. > Attempt to make a directory under an existing file on DistributedFileSystem > should throw a FileAlreadyExistsException instead of FileNotFoundException > -- > > Key: HDFS-602 > URL: https://issues.apache.org/jira/browse/HDFS-602 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Boris Shkolnik >Assignee: Boris Shkolnik > Fix For: 0.21.0 > > Attachments: HDFS-602.patch > > > An attempt to make a directory under an existing file on DistributedFileSystem > should throw a FileAlreadyExistsException instead of FileNotFoundException. > Also we should unwrap this exception from RemoteException.
[jira] Updated: (HDFS-602) Attempt to make a directory under an existing file on DistributedFileSystem should throw a FileAlreadyExistsException instead of FileNotFoundException
[ https://issues.apache.org/jira/browse/HDFS-602?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-602: - Component/s: name-node hdfs client > Attempt to make a directory under an existing file on DistributedFileSystem > should throw a FileAlreadyExistsException instead of FileNotFoundException
[jira] Assigned: (HDFS-587) Test programs support only default queue.
[ https://issues.apache.org/jira/browse/HDFS-587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Erik Steffl reassigned HDFS-587: Assignee: Erik Steffl > Test programs support only default queue. > - > > Key: HDFS-587 > URL: https://issues.apache.org/jira/browse/HDFS-587 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Reporter: Sreekanth Ramakrishnan >Assignee: Erik Steffl > > Following test programs always run on "default" queue even when other queues > are passed as job parameter. > DFSCIOTest > DistributedFSCheck > TestDFSIO > Filebench > Loadgen > Nnbench -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-602) Attempt to make a directory under an existing file on DistributedFileSystem should throw a FileAlreadyExistsException instead of FileNotFoundException
[ https://issues.apache.org/jira/browse/HDFS-602?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754274#action_12754274 ] Hudson commented on HDFS-602: - Integrated in Hadoop-Hdfs-trunk-Commit #28 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/28/]). DistributedFileSystem mkdirs command throws FileAlreadyExistsException instead of FileNotFoundException. Contributed by Boris Shkolnik. > Attempt to make a directory under an existing file on DistributedFileSystem > should throw a FileAlreadyExistsException instead of FileNotFoundException
[jira] Updated: (HDFS-592) Allow client to get a new generation stamp from NameNode
[ https://issues.apache.org/jira/browse/HDFS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hairong Kuang updated HDFS-592: --- Component/s: (was: hdfs client) name-node Description: This issue aims to add an API to ClientProtocol that fetches a new generation stamp and an access token from NameNode to support append or pipeline recovery. (was: This issue aims to 1. add nextGenerationStamp API to ClientProtocol that fetches a new generation stamp and an access token from NameNode; 2. change append so it additionally returns a new generation stamp and an access token.) > Allow client to get a new generation stamp from NameNode > > > Key: HDFS-592 > URL: https://issues.apache.org/jira/browse/HDFS-592 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: name-node >Affects Versions: Append Branch >Reporter: Hairong Kuang >Assignee: Hairong Kuang > Fix For: Append Branch > > > This issue aims to add an API to ClientProtocol that fetches a new > generation stamp and an access token from NameNode to support append or > pipeline recovery.
[jira] Updated: (HDFS-592) Allow client to get a new generation stamp from NameNode
[ https://issues.apache.org/jira/browse/HDFS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hairong Kuang updated HDFS-592: --- Attachment: newGS.patch This patch adds a new API getNewGenerationStampAndAccessToken to ClientProtocol. It fetches a new generation stamp and an access token for an under construction block. > Allow client to get a new generation stamp from NameNode > > > Key: HDFS-592 > URL: https://issues.apache.org/jira/browse/HDFS-592 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: name-node >Affects Versions: Append Branch >Reporter: Hairong Kuang >Assignee: Hairong Kuang > Fix For: Append Branch > > Attachments: newGS.patch > > > This issue aims to add an API to ClientProtocol that fetches a new > generation stamp and an access token from NameNode to support append or > pipeline recovery. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HDFS-612) FSDataset should not use org.mortbay.log.Log
FSDataset should not use org.mortbay.log.Log Key: HDFS-612 URL: https://issues.apache.org/jira/browse/HDFS-612 Project: Hadoop HDFS Issue Type: Bug Components: data-node Affects Versions: 0.21.0 Reporter: Tsz Wo (Nicholas), SZE Assignee: Tsz Wo (Nicholas), SZE Fix For: 0.21.0 There is some code in FSDataset using org.mortbay.log.Log.
[jira] Updated: (HDFS-612) FSDataset should not use org.mortbay.log.Log
[ https://issues.apache.org/jira/browse/HDFS-612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-612: Attachment: h612_20090911.patch h612_20090911.patch: uses DataNode.LOG for logging. > FSDataset should not use org.mortbay.log.Log > > > Key: HDFS-612 > URL: https://issues.apache.org/jira/browse/HDFS-612 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.21.0 >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 0.21.0 > > Attachments: h612_20090911.patch > > > There are some codes in FSDataset using org.mortbay.log.Log. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-592) Allow client to get a new generation stamp from NameNode
[ https://issues.apache.org/jira/browse/HDFS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hairong Kuang updated HDFS-592: --- Attachment: newGS1.patch Kan suggested to make the return type of the API to be LocatedBlock that encapsulates the new GS and access token. Here is the patch. > Allow client to get a new generation stamp from NameNode > > > Key: HDFS-592 > URL: https://issues.apache.org/jira/browse/HDFS-592 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: name-node >Affects Versions: Append Branch >Reporter: Hairong Kuang >Assignee: Hairong Kuang > Fix For: Append Branch > > Attachments: newGS.patch, newGS1.patch > > > This issue aims to add an API to ClientProtocol that fetches a new > generation stamp and an access token from NameNode to support append or > pipeline recovery. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
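Kan's suggestion above is a small API-design choice worth spelling out: instead of returning the generation stamp and access token as separate values, bundle them in one return object. The sketch below is illustrative only; HDFS uses its real LocatedBlock class, while LocatedBlockInfo and its fields here are invented names.

```java
// Hedged sketch of the "return one value object" design, not the HDFS API.
// Immutable, with a defensive copy of the mutable token bytes.
public class LocatedBlockInfo {
  private final long blockId;
  private final long generationStamp;
  private final byte[] accessToken;

  public LocatedBlockInfo(long blockId, long generationStamp, byte[] accessToken) {
    this.blockId = blockId;
    this.generationStamp = generationStamp;
    this.accessToken = accessToken.clone(); // caller can't mutate our copy
  }

  public long getBlockId() { return blockId; }
  public long getGenerationStamp() { return generationStamp; }
  public byte[] getAccessToken() { return accessToken.clone(); }
}
```

The payoff is evolvability: a single ClientProtocol method returning this object can later carry more fields without another signature change, which is exactly the churn the newGS.patch → newGS1.patch revision avoids.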
[jira] Commented: (HDFS-550) DataNode restarts may introduce corrupt/duplicated/lost replicas when handling detached replicas
[ https://issues.apache.org/jira/browse/HDFS-550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754403#action_12754403 ] dhruba borthakur commented on HDFS-550: --- Hi Hairong, sorry for not getting back to you earlier. I like option 2 that you have implemented in this patch. Option 1 is not safe to do on the Windows platform. > DataNode restarts may introduce corrupt/duplicated/lost replicas when > handling detached replicas
[jira] Updated: (HDFS-574) Hadoop Doc Split: HDFS Docs
[ https://issues.apache.org/jira/browse/HDFS-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Corinne Chandel updated HDFS-574: - I created the wiki page (no content added): http://wiki.apache.org/hadoop/HDFS/FAQ > Hadoop Doc Split: HDFS Docs > --- > > Key: HDFS-574 > URL: https://issues.apache.org/jira/browse/HDFS-574 > Project: Hadoop HDFS > Issue Type: Task > Components: documentation >Affects Versions: 0.21.0 >Reporter: Corinne Chandel >Assignee: Owen O'Malley >Priority: Blocker > Attachments: Hadoop-Doc-Split.doc, HDFS-574-hdfs.patch > > > Hadoop Doc Split: HDFS Docs > Please note that I am unable to directly check all of the new links. Some > links may break and will need to be updated. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-606) ConcurrentModificationException in invalidateCorruptReplicas()
[ https://issues.apache.org/jira/browse/HDFS-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754411#action_12754411 ] Hudson commented on HDFS-606: - Integrated in Hadoop-Hdfs-trunk-Commit #29 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Hdfs-trunk-Commit/29/]). Fix ConcurrentModificationException in invalidateCorruptReplicas(). Contributed by Konstantin Shvachko. > ConcurrentModificationException in invalidateCorruptReplicas() > -- > > Key: HDFS-606 > URL: https://issues.apache.org/jira/browse/HDFS-606 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 0.21.0 >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko > Fix For: 0.21.0 > > Attachments: CMEinCorruptReplicas.patch > > > {{BlockManager.invalidateCorruptReplicas()}} iterates over > DatanodeDescriptor-s while removing corrupt replicas from the descriptors. > This causes {{ConcurrentModificationException}} if there is more than one > replica of the block. I ran into this exception debugging different > scenarios in append, but it should be fixed in the trunk too.
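The bug class fixed here is the standard Java fail-fast iteration hazard: removing elements from a collection while iterating over it throws ConcurrentModificationException. The sketch below is illustrative (a toy replica list, not the actual BlockManager code) and shows the usual fix, removing through the iterator itself.

```java
import java.util.Iterator;
import java.util.List;

// Toy demonstration of the ConcurrentModificationException pattern -- not HDFS code.
public class CmeDemo {
  /** Buggy pattern: structural removal inside a for-each loop. Returns false if CME was thrown. */
  public static boolean removeDuringForEach(List<String> replicas, String bad) {
    try {
      for (String r : replicas) {
        if (r.equals(bad)) replicas.remove(r); // modifies the list under the iterator
      }
      return true;
    } catch (java.util.ConcurrentModificationException e) {
      return false; // fail-fast iterator detected the concurrent modification
    }
  }

  /** Safe pattern: remove through the iterator, which keeps its bookkeeping consistent. */
  public static void removeViaIterator(List<String> replicas, String bad) {
    for (Iterator<String> it = replicas.iterator(); it.hasNext(); ) {
      if (it.next().equals(bad)) it.remove();
    }
  }
}
```

Another common fix, when the removal happens in a different method than the loop (as in invalidateCorruptReplicas()), is to iterate over a copy of the collection and remove from the original.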
[jira] Created: (HDFS-613) TestBalancer fails with -ea option.
TestBalancer fails with -ea option. --- Key: HDFS-613 URL: https://issues.apache.org/jira/browse/HDFS-613 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.21.0 Reporter: Konstantin Shvachko Fix For: 0.21.0 When TestBalancer is run with asserts on, the assert in {{Balancer.chooseNode()}} is triggered and the test fails. We do not see it in the builds because asserts are off there. So either the assert is irrelevant or there is another bug in the Balancer code.
[jira] Commented: (HDFS-589) Change block write protocol to support pipeline recovery
[ https://issues.apache.org/jira/browse/HDFS-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754414#action_12754414 ] Suresh Srinivas commented on HDFS-589: -- Comments: # DataTransferProtocol - add a comment to BlockConstructionStage.valueOf() with {{isRecovery()}} to describe that it handles both the regular and recovery cases. Also, reordering any of the enums can break this logic. Add a comment to preserve the order. # DataStreamer for append has the {{if (freeInLastBlock > blockSize)}} check. It will never be true, right? > Change block write protocol to support pipeline recovery > > > Key: HDFS-589 > URL: https://issues.apache.org/jira/browse/HDFS-589 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: Append Branch >Reporter: Hairong Kuang >Assignee: Hairong Kuang > Fix For: Append Branch > > Attachments: opWriteProtocol.patch > > > Current block write operation's header has the following fields: > blockId blockGS pipelineSize isRecovery clientName hasSource source > #datanodesInDownStreamPipeline downstreamDatanodes > I'd like to change the header to be > blockId blockGS pipelineSize clientName flags blockMinLen blockMaxLen newGS > hasSource source > #datanodesInDownStreamPipeline downstreamDatanodes > With this protocol change, pipeline recovery will be performed when a new > pipeline is set up.
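Suresh's warning about enum reordering is worth a concrete illustration. When an enum constant's declaration position participates in the wire encoding (directly via ordinal(), or via a valueOf()-style mapping that assumes the order), reordering the constants silently changes the protocol. The names below are invented for illustration, not the real DataTransferProtocol code.

```java
// Toy sketch of an ordinal-based wire encoding -- not the HDFS enum.
// Reordering the Stage constants would change every encoded value,
// which is why a "preserve the order" comment on the real enum matters.
public class StageCodec {
  public enum Stage { PIPELINE_SETUP_APPEND, DATA_STREAMING, PIPELINE_CLOSE }

  public static int encode(Stage s) {
    return s.ordinal();          // order-sensitive: 0, 1, 2 by declaration order
  }

  public static Stage decode(int code) {
    return Stage.values()[code]; // breaks compatibility if constants are reordered
  }
}
```

A more robust alternative is to assign each constant an explicit wire value in its constructor, so the declaration order stops mattering.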
[jira] Assigned: (HDFS-222) Support for concatenating of files into a single file
[ https://issues.apache.org/jira/browse/HDFS-222?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Boris Shkolnik reassigned HDFS-222: --- Assignee: Boris Shkolnik > Support for concatenating of files into a single file > - > > Key: HDFS-222 > URL: https://issues.apache.org/jira/browse/HDFS-222 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Venkatesh S >Assignee: Boris Shkolnik > > An API to concatenate files of same size and replication factor on HDFS into > a single larger file. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-612) FSDataset should not use org.mortbay.log.Log
[ https://issues.apache.org/jira/browse/HDFS-612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754415#action_12754415 ] Boris Shkolnik commented on HDFS-612: - +1 > FSDataset should not use org.mortbay.log.Log > > > Key: HDFS-612 > URL: https://issues.apache.org/jira/browse/HDFS-612 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.21.0 >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 0.21.0 > > Attachments: h612_20090911.patch > > > There are some codes in FSDataset using org.mortbay.log.Log. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-612) FSDataset should not use org.mortbay.log.Log
[ https://issues.apache.org/jira/browse/HDFS-612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754417#action_12754417 ] Boris Shkolnik commented on HDFS-612: - could you also remove "" from line 1504 > FSDataset should not use org.mortbay.log.Log > > > Key: HDFS-612 > URL: https://issues.apache.org/jira/browse/HDFS-612 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.21.0 >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 0.21.0 > > Attachments: h612_20090911.patch > > > There are some codes in FSDataset using org.mortbay.log.Log. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-613) TestBalancer fails with -ea option.
[ https://issues.apache.org/jira/browse/HDFS-613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754418#action_12754418 ] Konstantin Shvachko commented on HDFS-613: -- This is the assertion: {{AssertionError: Mismatched number of datanodes}} It is in the {{Balancer.chooseNode()}} method without parameters. > TestBalancer fails with -ea option.
[jira] Updated: (HDFS-612) FSDataset should not use org.mortbay.log.Log
[ https://issues.apache.org/jira/browse/HDFS-612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tsz Wo (Nicholas), SZE updated HDFS-612: Attachment: h612_20090911b.patch h612_20090911b.patch: removed "" > FSDataset should not use org.mortbay.log.Log > > > Key: HDFS-612 > URL: https://issues.apache.org/jira/browse/HDFS-612 > Project: Hadoop HDFS > Issue Type: Bug > Components: data-node >Affects Versions: 0.21.0 >Reporter: Tsz Wo (Nicholas), SZE >Assignee: Tsz Wo (Nicholas), SZE > Fix For: 0.21.0 > > Attachments: h612_20090911.patch, h612_20090911b.patch > > > There are some codes in FSDataset using org.mortbay.log.Log. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-574) Hadoop Doc Split: HDFS Docs
[ https://issues.apache.org/jira/browse/HDFS-574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Corinne Chandel updated HDFS-574: - Attachment: Hadoop-Doc-Split-2.doc > Hadoop Doc Split: HDFS Docs > --- > > Key: HDFS-574 > URL: https://issues.apache.org/jira/browse/HDFS-574 > Project: Hadoop HDFS > Issue Type: Task > Components: documentation >Affects Versions: 0.21.0 >Reporter: Corinne Chandel >Assignee: Owen O'Malley >Priority: Blocker > Attachments: Hadoop-Doc-Split-2.doc, Hadoop-Doc-Split.doc, > HDFS-574-hdfs.patch > > > Hadoop Doc Split: HDFS Docs > Please note that I am unable to directly check all of the new links. Some > links may break and will need to be updated. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HDFS-614) TestDatanodeBlockScanner should get data-node directories directly from MiniDFSCluster
TestDatanodeBlockScanner should get data-node directories directly from MiniDFSCluster -- Key: HDFS-614 URL: https://issues.apache.org/jira/browse/HDFS-614 Project: Hadoop HDFS Issue Type: Bug Components: test Affects Versions: 0.21.0 Reporter: Konstantin Shvachko Fix For: 0.21.0 TestDatanodeBlockScanner relies on the data-node directories being listed in {{test.build.data}}, which is not true if the test is run from Eclipse. It should get the directories directly from {{MiniDFSCluster}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-614) TestDatanodeBlockScanner should get data-node directories directly from MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-614: - Attachment: TestDNBlockScanner.patch Here is the patch that does it. I also added assertions to make sure that blocks are really corrupted. This makes the test fail rather than loop indefinitely in case the block was not corrupted. > TestDatanodeBlockScanner should get data-node directories directly from > MiniDFSCluster > -- > > Key: HDFS-614 > URL: https://issues.apache.org/jira/browse/HDFS-614 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.21.0 >Reporter: Konstantin Shvachko > Fix For: 0.21.0 > > Attachments: TestDNBlockScanner.patch > > > TestDatanodeBlockScanner relies on the data-node directories being listed in > {{test.build.data}}, which is not true if the test is run from Eclipse. It should > get the directories directly from {{MiniDFSCluster}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-614) TestDatanodeBlockScanner should get data-node directories directly from MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Konstantin Shvachko updated HDFS-614: - Assignee: Konstantin Shvachko Status: Patch Available (was: Open) > TestDatanodeBlockScanner should get data-node directories directly from > MiniDFSCluster > -- > > Key: HDFS-614 > URL: https://issues.apache.org/jira/browse/HDFS-614 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.21.0 >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko > Fix For: 0.21.0 > > Attachments: TestDNBlockScanner.patch > > > TestDatanodeBlockScanner relies on the data-node directories being listed in > {{test.build.data}}, which is not true if the test is run from Eclipse. It should > get the directories directly from {{MiniDFSCluster}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-574) Hadoop Doc Split: HDFS Docs
[ https://issues.apache.org/jira/browse/HDFS-574?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754432#action_12754432 ] Corinne Chandel commented on HDFS-574: -- See the attached updated file: Hadoop-Doc-Split-2.doc After you apply the patch, you may need to manually delete some XML files that don't belong with HDFS 0.21. Only these 11 XML doc files belong with HDFS for the 0.21 release: 1. faultinject_framework.xml 2. hdfs_design.xml 3. hdfs_imageviewer.xml 4. hdfs_permissions_guide.xml 5. hdfs_quota_admin_guide.xml 6. hdfs_user_guide.xml 7. libhdfs.xml 8. SLG_user_guide.xml 9. index.xml (overview) 10. site.xml 11. tabs.xml > Hadoop Doc Split: HDFS Docs > --- > > Key: HDFS-574 > URL: https://issues.apache.org/jira/browse/HDFS-574 > Project: Hadoop HDFS > Issue Type: Task > Components: documentation >Affects Versions: 0.21.0 >Reporter: Corinne Chandel >Assignee: Owen O'Malley >Priority: Blocker > Attachments: Hadoop-Doc-Split-2.doc, Hadoop-Doc-Split.doc, > HDFS-574-hdfs.patch > > > Hadoop Doc Split: HDFS Docs > Please note that I am unable to directly check all of the new links. Some > links may break and will need to be updated. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-592) Allow client to get a new generation stamp from NameNode
[ https://issues.apache.org/jira/browse/HDFS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754445#action_12754445 ] Kan Zhang commented on HDFS-592: The Namenode needs to verify that the requesting client is the one that was previously authorized to write to the block. Otherwise, this can become a security hole. This check is missing in the existing code (it was hard to do since, in the existing code, recovery is done at the datanode). We probably need to open a new JIRA for this. For now you may want to let the client send the clientname it used in the create() call and check that the DFSClient instance is the leaseholder. However, this may not solve the problem since the clientname may be guessed. For security purposes, the check should be based on an authenticated username. Also, can we choose a method name other than getNewGenerationStampAndAccessToken()? In my view, the namenode is not doing this as a general service to any client that wants an access token. This is done only in the context of pipeline recovery. How about using something like pipelineRecovery()? > Allow client to get a new generation stamp from NameNode > > > Key: HDFS-592 > URL: https://issues.apache.org/jira/browse/HDFS-592 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: name-node >Affects Versions: Append Branch >Reporter: Hairong Kuang >Assignee: Hairong Kuang > Fix For: Append Branch > > Attachments: newGS.patch, newGS1.patch > > > This issue aims to add an API to ClientProtocol that fetches a new > generation stamp and an access token from NameNode to support append or > pipeline recovery. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
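Kan's interim suggestion (compare the submitted clientname against the recorded lease holder) can be sketched with a toy lookup. The map, the method names, and the exception choice below are hypothetical stand-ins; the real check would live in the namenode's lease manager, and as Kan notes it is not a true security check since a clientname can be guessed:

```java
import java.util.HashMap;
import java.util.Map;

// Hedged sketch of a lease-holder check: only the client that holds the
// lease on a path may request a new generation stamp for it. All names
// here are illustrative, not the actual NameNode/LeaseManager API.
public class LeaseCheckDemo {
    static final Map<String, String> leaseHolderByPath = new HashMap<>();

    static void checkLease(String path, String clientName) {
        String holder = leaseHolderByPath.get(path);
        if (holder == null || !holder.equals(clientName)) {
            throw new SecurityException(
                clientName + " does not hold the lease on " + path);
        }
    }

    public static void main(String[] args) {
        leaseHolderByPath.put("/user/a/file", "DFSClient_123");
        checkLease("/user/a/file", "DFSClient_123"); // lease holder: allowed
        try {
            checkLease("/user/a/file", "DFSClient_999"); // not the holder
        } catch (SecurityException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

An authenticated principal, as Kan suggests, would replace the guessable clientname string in the comparison.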
[jira] Commented: (HDFS-592) Allow client to get a new generation stamp from NameNode
[ https://issues.apache.org/jira/browse/HDFS-592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754453#action_12754453 ] Hairong Kuang commented on HDFS-592: > For now you may want to let the client send the clientname it used in the > create() call and check that the DFSClient instance is the leaseholder. +1 > This is done only in the context of pipeline recovery. How about using > something like pipelineRecovery()? Choosing the right name is always hard. This is used in both pipeline recovery and setting up the initial pipeline for append. > Allow client to get a new generation stamp from NameNode > > > Key: HDFS-592 > URL: https://issues.apache.org/jira/browse/HDFS-592 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: name-node >Affects Versions: Append Branch >Reporter: Hairong Kuang >Assignee: Hairong Kuang > Fix For: Append Branch > > Attachments: newGS.patch, newGS1.patch > > > This issue aims to add an API to ClientProtocol that fetches a new > generation stamp and an access token from NameNode to support append or > pipeline recovery. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-589) Change block write protocol to support pipeline recovery
[ https://issues.apache.org/jira/browse/HDFS-589?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754457#action_12754457 ] Tsz Wo (Nicholas), SZE commented on HDFS-589: - - I think it is better to change {code} +public static BlockConstructionStage valueOf( +byte code, boolean isRecovery) { + return valueOf((byte)(isRecovery ? (code|RECOVERY_BIT) : code)); +} {code} to {code} public BlockConstructionStage combine(boolean isRecovery) { return isRecovery? valueOf(ordinal()|RECOVERY_BIT): this; } {code} Then, the code in DFSClient becomes {code} BlockConstructionStage blockStage = stage.combine(recoveryFlag); {code} - In line 737 of the patch, {code} +// stage = BlockConstructionStage.PIPELINE_CLOSE; {code} Should it be uncommented? > Change block write protocol to support pipeline recovery > > > Key: HDFS-589 > URL: https://issues.apache.org/jira/browse/HDFS-589 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: Append Branch >Reporter: Hairong Kuang >Assignee: Hairong Kuang > Fix For: Append Branch > > Attachments: opWriteProtocol.patch > > > The current block write operation's header has the following fields: > blockId blockGS pipelineSize isRecovery clientName hasSource source > #datanodesInDownStreamPipeline downstreamDatanodes > I'd like to change the header to be > blockId blockGS pipelineSize clientName flags blockMinLen blockMaxLen newGS > hasSource source #datanodesInDownStreamPipeline downstreamDatanodes > With this protocol change, pipeline recovery will be performed when a new > pipeline is set up. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
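Nicholas's combine() suggestion can be shown as a self-contained enum. The stage names, the wire codes, and the value of RECOVERY_BIT below are hypothetical stand-ins, not the actual BlockConstructionStage definition from the patch:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of folding an isRecovery flag into the stage enum itself: each
// recovery variant occupies the slot (code | RECOVERY_BIT), so combine()
// needs no switch statement. Names and codes are illustrative only.
public class StageDemo {
    static final int RECOVERY_BIT = 0x4;

    enum Stage {
        PIPELINE_SETUP_APPEND(1),
        DATA_STREAMING(2),
        PIPELINE_SETUP_APPEND_RECOVERY(1 | RECOVERY_BIT),
        DATA_STREAMING_RECOVERY(2 | RECOVERY_BIT);

        final int code;
        Stage(int code) { this.code = code; }

        private static final Map<Integer, Stage> BY_CODE = new HashMap<>();
        static {
            for (Stage s : values()) BY_CODE.put(s.code, s);
        }

        static Stage fromCode(int code) { return BY_CODE.get(code); }

        // The suggested idiom: the stage folds the recovery flag in,
        // so callers stop passing (code, isRecovery) pairs around.
        Stage combine(boolean isRecovery) {
            return isRecovery ? fromCode(code | RECOVERY_BIT) : this;
        }
    }

    public static void main(String[] args) {
        System.out.println(Stage.DATA_STREAMING.combine(true));  // DATA_STREAMING_RECOVERY
        System.out.println(Stage.DATA_STREAMING.combine(false)); // DATA_STREAMING
    }
}
```

The client-side call then reads as a single expression, e.g. `Stage blockStage = stage.combine(recoveryFlag);`, matching the DFSClient snippet quoted above.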
[jira] Created: (HDFS-615) TestLargeDirectoryDelete fails with NullPointerException
TestLargeDirectoryDelete fails with NullPointerException Key: HDFS-615 URL: https://issues.apache.org/jira/browse/HDFS-615 Project: Hadoop HDFS Issue Type: Bug Components: name-node Affects Versions: 0.21.0 Environment: 64-bit debian 5, 64-bit sun java6, running in a single processor VM. Reporter: Eli Collins Priority: Minor I've hit the following failure two out of two times running "ant test" at rev 813587. This test doesn't appear to be failing on hudson. All other tests passed except TestHDFSFileSystemContract which timed out, so perhaps there's a race due to the test executing slowly. [junit] Running org.apache.hadoop.hdfs.server.namenode.TestLargeDirectoryDelete [junit] Exception in thread "Thread-30148" java.lang.NullPointerException [junit] at org.apache.hadoop.hdfs.server.namenode.NameNodeAdapter.getNamesystem(NameNodeAdapter.java:32) [junit] at org.apache.hadoop.hdfs.MiniDFSCluster.getNamesystem(MiniDFSCluster.java:522) [junit] at org.apache.hadoop.hdfs.server.namenode.TestLargeDirectoryDelete.getBlockCount(TestLargeDirectoryDelete.java:75) [junit] at org.apache.hadoop.hdfs.server.namenode.TestLargeDirectoryDelete.access$000(TestLargeDirectoryDelete.java:38) [junit] at org.apache.hadoop.hdfs.server.namenode.TestLargeDirectoryDelete$1.run(TestLargeDirectoryDelete.java:90) [junit] Tests run: 1, Failures: 0, Errors: 0, Time elapsed: 94.264 sec No failures or errors? public static FSNamesystem getNamesystem(NameNode namenode) { return namenode.getNamesystem(); <=== } -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
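The stack trace is consistent with a shutdown race: the test's background thread calls getNamesystem() after the cluster's namenode reference has been cleared, and nothing guards the dereference, while JUnit still reports success because the exception happened off the test thread. A minimal self-contained illustration of the race and a defensive fix (the field and method names are stand-ins for the MiniDFSCluster internals, not the real code):

```java
// Illustrates the race: a background thread dereferences a shared field
// that the main thread has nulled out during shutdown. The fix shown is
// to snapshot the reference once and fail with a clear message instead
// of letting a raw NullPointerException escape on a non-test thread.
public class ShutdownRaceDemo {
    static volatile Object namenode = new Object(); // stands in for the NameNode field

    static Object getNamesystem() {
        Object nn = namenode;          // read the shared field exactly once
        if (nn == null) {
            throw new IllegalStateException("cluster is shut down");
        }
        return nn;                     // real code would return nn.getNamesystem()
    }

    public static void main(String[] args) {
        System.out.println(getNamesystem() != null); // true while "running"
        namenode = null;                             // simulated shutdown
        try {
            getNamesystem();
        } catch (IllegalStateException e) {
            System.out.println("caught: " + e.getMessage());
        }
    }
}
```

Checking `nn` rather than re-reading the field avoids the window where the field goes null between the null check and the dereference.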
[jira] Commented: (HDFS-614) TestDatanodeBlockScanner should get data-node directories directly from MiniDFSCluster
[ https://issues.apache.org/jira/browse/HDFS-614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12754472#action_12754472 ] Hadoop QA commented on HDFS-614: -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12419357/TestDNBlockScanner.patch against trunk revision 814047. +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. +1 javadoc. The javadoc tool did not generate any warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 findbugs. The patch does not introduce any new Findbugs warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed core unit tests. +1 contrib tests. The patch passed contrib unit tests. Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/22/testReport/ Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/22/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/22/artifact/trunk/build/test/checkstyle-errors.html Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/22/console This message is automatically generated. 
> TestDatanodeBlockScanner should get data-node directories directly from > MiniDFSCluster > -- > > Key: HDFS-614 > URL: https://issues.apache.org/jira/browse/HDFS-614 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 0.21.0 >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko > Fix For: 0.21.0 > > Attachments: TestDNBlockScanner.patch > > > TestDatanodeBlockScanner relies on the data-node directories being listed in > {{test.build.data}}, which is not true if the test is run from Eclipse. It should > get the directories directly from {{MiniDFSCluster}}. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.