[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
[ https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840051#action_12840051 ]

Hadoop QA commented on HDFS-1001:
---------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org
against trunk revision 916902.

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    -1 patch.  The patch command could not apply the patch.

Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/257/console

This message is automatically generated.

> DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
> ---------------------------------------------------------------------
>
>                 Key: HDFS-1001
>                 URL: https://issues.apache.org/jira/browse/HDFS-1001
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: bc Wong
>
> Running TestPread with additional debug statements reveals that the
> BlockReader sends CHECKSUM_OK when the DataXceiver doesn't expect it.
> Currently it doesn't matter, since DataXceiver closes the connection after
> each op and CHECKSUM_OK is the last thing on the wire. But if we want to
> cache connections, they need to agree on the exchange of CHECKSUM_OK.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
[ https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840048#action_12840048 ]

bc Wong commented on HDFS-1001:
-------------------------------

I have a patch. But I can't assign this issue to myself. Could someone please fix Jira to let me work on it? Thanks.
[jira] Updated: (HDFS-1001) DataXceiver and BlockReader disagree on when to send/recv CHECKSUM_OK
[ https://issues.apache.org/jira/browse/HDFS-1001?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

bc Wong updated HDFS-1001:
--------------------------

    Status: Patch Available  (was: Open)
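The connection-caching hazard behind HDFS-1001 can be illustrated with a toy model (plain Python, not HDFS code): treat a cached connection as a shared byte queue. If the reader sends CHECKSUM_OK but the serving side does not consume it, the stale byte is misread as the opcode of the next operation on the reused connection. All names and numeric values below are illustrative assumptions, not the real DataTransferProtocol constants.

```python
# Toy model of a cached connection as a byte queue shared by client and
# server. Constants are illustrative, not actual HDFS protocol values.
from collections import deque

CHECKSUM_OK = 0x01
OP_READ_BLOCK = 0x51  # hypothetical opcode for "next read request"

def client_finish_read(wire, send_checksum_ok):
    # Reader side: optionally send CHECKSUM_OK after finishing a block read.
    if send_checksum_ok:
        wire.append(CHECKSUM_OK)

def server_finish_read(wire, expect_checksum_ok):
    # Serving side: consume the trailing CHECKSUM_OK only if it expects one.
    if expect_checksum_ok:
        wire.popleft()

def next_op(wire):
    # Client issues the next operation; server reads what it believes
    # is the opcode byte.
    wire.append(OP_READ_BLOCK)
    return wire.popleft()

wire = deque()
# The disagreement: client sends CHECKSUM_OK, server does not expect it.
client_finish_read(wire, send_checksum_ok=True)
server_finish_read(wire, expect_checksum_ok=False)
print(next_op(wire) == OP_READ_BLOCK)  # False: server reads the stale byte
```

With a closed-per-op connection the leftover byte is simply discarded, which is why the mismatch is harmless today; once connections are cached, both sides must agree or every subsequent op on the connection is desynchronized.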
[jira] Commented: (HDFS-984) Delegation Tokens should be persisted in Namenode
[ https://issues.apache.org/jira/browse/HDFS-984?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840041#action_12840041 ]

Hudson commented on HDFS-984:
-----------------------------

Integrated in Hadoop-Mapreduce-trunk-Commit #252 (See [http://hudson.zones.apache.org/hudson/job/Hadoop-Mapreduce-trunk-Commit/252/])

> Delegation Tokens should be persisted in Namenode
> -------------------------------------------------
>
>                 Key: HDFS-984
>                 URL: https://issues.apache.org/jira/browse/HDFS-984
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
>             Fix For: 0.22.0
>
>         Attachments: HDFS-984-0_20.4.patch, HDFS-984.10.patch, HDFS-984.11.patch, HDFS-984.12.patch, HDFS-984.14.patch, HDFS-984.7.patch
>
> The delegation tokens should be persisted in the FsImage and EditLogs so that
> they remain valid for use after namenode shutdown and restart.
[jira] Updated: (HDFS-204) Revive number of files listed metrics
[ https://issues.apache.org/jira/browse/HDFS-204?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hairong Kuang updated HDFS-204:
-------------------------------

    Attachment: getFileNum-yahoo20.patch

This patch ports the feature to the Yahoo 20 branch. In addition, it fixes a bug where an NPE was thrown when getListing was called on a non-existent path. I also added two more test cases: one lists a non-existent path and the other lists a path that represents a file.

> Revive number of files listed metrics
> -------------------------------------
>
>                 Key: HDFS-204
>                 URL: https://issues.apache.org/jira/browse/HDFS-204
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: name-node
>    Affects Versions: 0.21.0
>            Reporter: Koji Noguchi
>            Assignee: Jitendra Nath Pandey
>             Fix For: 0.21.0
>
>         Attachments: getFileNum-yahoo20.patch, HDFS-204-2.patch, HDFS-204-2.patch, HDFS-204.patch, HDFS-204.patch
>
> When the namenode became unresponsive due to HADOOP-4693 (large filelist
> calls), metrics were helpful in finding the cause: when gc time hiked, the
> "FileListed" metric also hiked.
> In 0.18, after we *fixed* the "FileListed" metric so that it shows the number
> of operations instead of the number of files listed (HADOOP-3683), I stopped
> seeing this relationship in the graph.
> Can we bring back a "NumberOfFilesListed" metric?
[jira] Commented: (HDFS-1014) Error in reading delegation tokens from edit logs.
[ https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12840009#action_12840009 ]

Hadoop QA commented on HDFS-1014:
---------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12437557/HDFS-1014.2.patch
against trunk revision 916902.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/120/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/120/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/120/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h2.grid.sp2.yahoo.net/120/console

This message is automatically generated.

> Error in reading delegation tokens from edit logs.
> --------------------------------------------------
>
>                 Key: HDFS-1014
>                 URL: https://issues.apache.org/jira/browse/HDFS-1014
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Jitendra Nath Pandey
>            Assignee: Jitendra Nath Pandey
>         Attachments: HDFS-1014-y20.1.patch, HDFS-1014.2.patch
>
> When delegation tokens are read from the edit logs, the same object is used
> to read the identifier and is then stored in the token cache. This is wrong
> because the same object keeps getting updated.
[jira] Commented: (HDFS-729) fsck option to list only corrupted files
[ https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839975#action_12839975 ]

Rodrigo Schmidt commented on HDFS-729:
--------------------------------------

The errors are the same as before, and other patches seem to be going through. Dhruba, could you please double-check that everything is fine with this patch?

> fsck option to list only corrupted files
> ----------------------------------------
>
>                 Key: HDFS-729
>                 URL: https://issues.apache.org/jira/browse/HDFS-729
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: dhruba borthakur
>            Assignee: Rodrigo Schmidt
>         Attachments: badFiles.txt, badFiles2.txt, corruptFiles.txt, HDFS-729.1.patch, HDFS-729.2.patch, HDFS-729.3.patch
>
> An option for fsck to list only corrupted files would be very helpful for
> frequent monitoring.
[jira] Commented: (HDFS-729) fsck option to list only corrupted files
[ https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839974#action_12839974 ]

Hadoop QA commented on HDFS-729:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12437530/HDFS-729.3.patch
against trunk revision 916902.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 6 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/256/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/256/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/256/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/256/console

This message is automatically generated.
[jira] Updated: (HDFS-988) saveNamespace can corrupt edits log
[ https://issues.apache.org/jira/browse/HDFS-988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

dhruba borthakur updated HDFS-988:
----------------------------------

    Tags: hbase

> saveNamespace can corrupt edits log
> -----------------------------------
>
>                 Key: HDFS-988
>                 URL: https://issues.apache.org/jira/browse/HDFS-988
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>            Reporter: dhruba borthakur
>         Attachments: saveNamespace.txt
>
> The administrator puts the namenode in safemode and then issues the
> savenamespace command. This can corrupt the edits log. The problem is that
> when the NN enters safemode, there could still be pending logSyncs occurring
> from other threads. Now the saveNamespace command, when executed, would save
> an edits log with partial writes. I have seen this happen on 0.20.
> https://issues.apache.org/jira/browse/HDFS-909?focusedCommentId=12828853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12828853
[jira] Updated: (HDFS-1014) Error in reading delegation tokens from edit logs.
[ https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HDFS-1014:
---------------------------------------

    Status: Patch Available  (was: Open)
[jira] Commented: (HDFS-1014) Error in reading delegation tokens from edit logs.
[ https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839962#action_12839962 ]

Jitendra Nath Pandey commented on HDFS-1014:
--------------------------------------------

HDFS-1014.2.patch is for trunk.
[jira] Updated: (HDFS-1014) Error in reading delegation tokens from edit logs.
[ https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HDFS-1014:
---------------------------------------

    Attachment: HDFS-1014.2.patch
[jira] Commented: (HDFS-1014) Error in reading delegation tokens from edit logs.
[ https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839944#action_12839944 ]

Konstantin Shvachko commented on HDFS-1014:
-------------------------------------------

+1 Patch looks good to me.
[jira] Assigned: (HDFS-1014) Error in reading delegation tokens from edit logs.
[ https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey reassigned HDFS-1014:
------------------------------------------

    Assignee: Jitendra Nath Pandey
[jira] Updated: (HDFS-1014) Error in reading delegation tokens from edit logs.
[ https://issues.apache.org/jira/browse/HDFS-1014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jitendra Nath Pandey updated HDFS-1014:
---------------------------------------

    Attachment: HDFS-1014-y20.1.patch

Patch for hadoop-20 is uploaded.
[jira] Created: (HDFS-1014) Error in reading delegation tokens from edit logs.
Error in reading delegation tokens from edit logs.
--------------------------------------------------

                 Key: HDFS-1014
                 URL: https://issues.apache.org/jira/browse/HDFS-1014
             Project: Hadoop HDFS
          Issue Type: Bug
            Reporter: Jitendra Nath Pandey

When delegation tokens are read from the edit logs, the same object is used to
read the identifier and is then stored in the token cache. This is wrong
because the same object keeps getting updated.
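The aliasing bug HDFS-1014 describes can be sketched with a toy model (plain Python, not the HDFS code): re-using one mutable "identifier" object while reading several tokens from a log means every cache entry aliases the same object, so earlier entries silently take on the last token's values. `TokenIdentifier.read_fields` here only mimics the overwrite-in-place behavior of Hadoop's `Writable.readFields`; it is not the real class.

```python
# Toy model of the mutable-object reuse bug: one shared instance vs. a
# fresh instance per record read from the log.

class TokenIdentifier:
    def read_fields(self, record):
        self.owner = record  # overwrite in place, Writable-style

def load_buggy(records):
    cache = {}
    ident = TokenIdentifier()          # one shared instance
    for i, rec in enumerate(records):
        ident.read_fields(rec)
        cache[i] = ident               # every entry is the SAME object
    return cache

def load_fixed(records):
    cache = {}
    for i, rec in enumerate(records):
        ident = TokenIdentifier()      # fresh instance per record
        ident.read_fields(rec)
        cache[i] = ident
    return cache

buggy = load_buggy(["alice", "bob"])
fixed = load_fixed(["alice", "bob"])
print(buggy[0].owner, fixed[0].owner)  # bob alice
```

In the buggy loader, the first cached token ends up carrying the second token's fields; allocating a new identifier per log record is the fix.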
[jira] Updated: (HDFS-458) Create target for 10 minute patch test build for hdfs
[ https://issues.apache.org/jira/browse/HDFS-458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Erik Steffl updated HDFS-458:
-----------------------------

    Attachment: jira.HDFS-458.branch-0.21.1xx.patch

> Create target for 10 minute patch test build for hdfs
> -----------------------------------------------------
>
>                 Key: HDFS-458
>                 URL: https://issues.apache.org/jira/browse/HDFS-458
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: build, test
>            Reporter: Jakob Homan
>            Assignee: Jakob Homan
>             Fix For: 0.21.0
>
>         Attachments: build.xml, HDFS-458.patch, HDFS-458.patch, jira.HDFS-458.branch-0.21.1xx.patch, TenMinuteTestData.xlsx
>
> It would be good to identify a subset of hdfs tests that provide strong test
> code coverage within 10 minutes, as is the goal of MAPREDUCE-670 and
> HADOOP-5628.
[jira] Updated: (HDFS-729) fsck option to list only corrupted files
[ https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rodrigo Schmidt updated HDFS-729:
---------------------------------

    Status: Patch Available  (was: Open)
[jira] Updated: (HDFS-729) fsck option to list only corrupted files
[ https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rodrigo Schmidt updated HDFS-729:
---------------------------------

    Attachment: HDFS-729.3.patch

New patch attached (3 is my lucky number). I made all the changes suggested by Dhruba.
[jira] Created: (HDFS-1013) Miscellaneous improvements to HTML markup for web UIs
Miscellaneous improvements to HTML markup for web UIs
-----------------------------------------------------

                 Key: HDFS-1013
                 URL: https://issues.apache.org/jira/browse/HDFS-1013
             Project: Hadoop HDFS
          Issue Type: Improvement
            Reporter: Todd Lipcon
            Priority: Minor

The web UIs have various bits of bad markup (e.g. missing sections, some pages
missing CSS links, inconsistent td vs th for table headings). We should fix
this up.
[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839875#action_12839875 ]

Konstantin Shvachko commented on HDFS-826:
------------------------------------------

What is the point of introducing the new {{Replicable}} interface if it is not used anywhere? The new method {{getNumCurrentReplicas()}} in {{FSOutputSummer}} would work fine.

> Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline
> -----------------------------------------------------------------------------------------------
>
>                 Key: HDFS-826
>                 URL: https://issues.apache.org/jira/browse/HDFS-826
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: hdfs client
>            Reporter: dhruba borthakur
>            Assignee: dhruba borthakur
>         Attachments: ReplicableHdfs.txt, ReplicableHdfs2.txt, ReplicableHdfs3.txt
>
> HDFS does not replicate the last block of a file that is currently being
> written to by an application. Every datanode death in the write pipeline
> decreases the reliability of the last block of the file currently being
> written. This situation can be improved if the application can be notified of
> a datanode death in the write pipeline. Then the application can decide what
> is the right course of action to take on this event.
> In our use-case, the application can close the file on the first datanode
> death and start writing to a newly created file. This ensures that the
> reliability guarantee of a block stays close to 3 at all times.
> One idea is to make DFSOutputStream.write() throw an exception if the number
> of datanodes in the write pipeline falls below minimum.replication.factor as
> set on the client (this is backward compatible).
[jira] Commented: (HDFS-898) Sequential generation of block ids
[ https://issues.apache.org/jira/browse/HDFS-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839870#action_12839870 ]

Konstantin Shvachko commented on HDFS-898:
------------------------------------------

Great! Yet another cluster that could have had its block ids converted collision-free using the 8-bit projection. Thanks, Dmytro.

> Sequential generation of block ids
> ----------------------------------
>
>                 Key: HDFS-898
>                 URL: https://issues.apache.org/jira/browse/HDFS-898
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: name-node
>    Affects Versions: 0.20.1
>            Reporter: Konstantin Shvachko
>            Assignee: Konstantin Shvachko
>             Fix For: 0.22.0
>
>         Attachments: DuplicateBlockIds.patch, FreeBlockIds.pdf, HighBitProjection.pdf
>
> This is a proposal to replace random generation of block ids with a
> sequential generator in order to avoid block id reuse in the future.
[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839858#action_12839858 ]

Hadoop QA commented on HDFS-826:
--------------------------------

-1 overall.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12436152/ReplicableHdfs3.txt
against trunk revision 916902.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 3 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    -1 core tests.  The patch failed core unit tests.

    -1 contrib tests.  The patch failed contrib unit tests.

Test results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/255/testReport/
Findbugs warnings: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/255/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/255/artifact/trunk/build/test/checkstyle-errors.html
Console output: http://hudson.zones.apache.org/hudson/job/Hdfs-Patch-h5.grid.sp2.yahoo.net/255/console

This message is automatically generated.
[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839853#action_12839853 ]

dhruba borthakur commented on HDFS-826:
---------------------------------------

Thanks Todd for the review. I agree with your recommendation and will post a new patch.
[jira] Commented: (HDFS-1007) HFTP needs to be updated to use delegation tokens
[ https://issues.apache.org/jira/browse/HDFS-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839844#action_12839844 ]

Kan Zhang commented on HDFS-1007:
---------------------------------

The delegation token should be fetched by the distcp client, not by HftpFilesystem (or HsftpFilesystem).

> HFTP needs to be updated to use delegation tokens
> -------------------------------------------------
>
>                 Key: HDFS-1007
>                 URL: https://issues.apache.org/jira/browse/HDFS-1007
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.22.0
>            Reporter: Devaraj Das
>             Fix For: 0.22.0
>
>         Attachments: distcp-hftp.1.patch, distcp-hftp.2.1.patch, distcp-hftp.2.patch, distcp-hftp.patch
>
> HFTPFileSystem should be updated to use delegation tokens so that it can
> talk to secure namenodes.
[jira] Commented: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839837#action_12839837 ]

Todd Lipcon commented on HDFS-826:
----------------------------------

Patch looks good to me. I question whether returning 0 for the case in between blocks is a good idea - this seems a bit confusing from the API user's perspective. Since it is well documented it may not be an issue, but I wonder if it would make more sense to actually return the intended replication in this case.
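The application-side policy proposed in HDFS-826 (close the file and roll to a new one when the pipeline shrinks) can be sketched as follows. This is a Python toy model: `get_num_current_replicas` stands in for the proposed `DFSOutputStream.getNumCurrentReplicas()`, and `FakeStream` is purely illustrative, not a real HDFS class.

```python
# Toy model of the "roll to a new file on pipeline shrinkage" policy.

class FakeStream:
    """Stand-in for an output stream whose write pipeline can lose nodes."""
    def __init__(self, pipeline):
        self.pipeline = pipeline       # list of live datanodes
    def get_num_current_replicas(self):
        # Models the proposed getNumCurrentReplicas(): the number of
        # datanodes still alive in the write pipeline.
        return len(self.pipeline)

def should_roll(stream, min_replication=3):
    # The application closes the current file and starts a new one as soon
    # as the write pipeline falls below the desired replication.
    return stream.get_num_current_replicas() < min_replication

s = FakeStream(["dn1", "dn2", "dn3"])
print(should_roll(s))      # False: full pipeline
s.pipeline.pop()           # one datanode dies
print(should_roll(s))      # True: time to close and re-create the file
```

Todd's comment above is about what the real method should report between blocks, when no pipeline exists yet; in this sketch that corresponds to choosing what `get_num_current_replicas` returns for an empty pipeline (0 versus the intended replication).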
[jira] Commented: (HDFS-898) Sequential generation of block ids
[ https://issues.apache.org/jira/browse/HDFS-898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839805#action_12839805 ] Dmytro Molkov commented on HDFS-898: I just ran the tool on the FB cluster. Here is the output: Bit map applied: ff00 Number of collisions = 0 = Number of blocks = 57909756 Number of negative ids = 57909756 Number of positive ids = 0 Largest segment = (-277768208, 9223372036854775807) Segment size = 9.223372037132544E18 Expected max = 318542942464 > Sequential generation of block ids > -- > > Key: HDFS-898 > URL: https://issues.apache.org/jira/browse/HDFS-898 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Affects Versions: 0.20.1 >Reporter: Konstantin Shvachko >Assignee: Konstantin Shvachko > Fix For: 0.22.0 > > Attachments: DuplicateBlockIds.patch, FreeBlockIds.pdf, > HighBitProjection.pdf > > > This is a proposal to replace random generation of block ids with a > sequential generator in order to avoid block id reuse in the future. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
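A sequential generator of the kind proposed in HDFS-898 could look roughly like the sketch below. The class name and the legacy-id skip logic are illustrative assumptions, not the attached patch; the point is only that a monotonically increasing counter, combined with a check against ids already handed out by the old random scheme, avoids future id reuse:

```java
import java.util.Set;
import java.util.concurrent.atomic.AtomicLong;

/** Minimal sketch of a sequential block-id generator (illustrative, not the actual patch). */
class SequentialBlockIdGenerator {
    private final AtomicLong lastId;

    SequentialBlockIdGenerator(long startAfter) {
        lastId = new AtomicLong(startAfter);
    }

    /** Next id in sequence, skipping ids already used by legacy randomly-generated blocks. */
    long nextId(Set<Long> legacyIds) {
        long id;
        do {
            id = lastId.incrementAndGet();
        } while (legacyIds.contains(id));
        return id;
    }
}
```

A real NameNode-side implementation would also have to persist the counter across restarts so that ids are never reissued after a crash.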
[jira] Updated: (HDFS-985) HDFS should issue multiple RPCs for listing a large directory
[ https://issues.apache.org/jira/browse/HDFS-985?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hairong Kuang updated HDFS-985: --- Attachment: testFileStatus.patch This patch fixes a bug in TestFileStatus.java. > HDFS should issue multiple RPCs for listing a large directory > - > > Key: HDFS-985 > URL: https://issues.apache.org/jira/browse/HDFS-985 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Hairong Kuang >Assignee: Hairong Kuang > Fix For: 0.22.0 > > Attachments: iterativeLS_yahoo.patch, iterativeLS_yahoo1.patch, > testFileStatus.patch > > > Currently HDFS issues one RPC from the client to the NameNode for listing a > directory. However, some directories are so large that they contain thousands or > millions of items. Listing such large directories in one RPC has a few > shortcomings: > 1. The list operation holds the global fsnamesystem lock for a long time, thus > blocking other requests. If a large number (like thousands) of such list > requests hit the NameNode in a short period of time, the NameNode will be > significantly slowed down. Users end up noticing longer response times or lost > connections to the NameNode. > 2. The response message is uncontrollably big. We observed a response as big > as 50M bytes when listing a directory of 300 thousand items. Even with the > optimization introduced at HDFS-946, which may be able to cut the response by > 20-50%, the response size will still be on the order of 10 megabytes. > I propose to implement directory listing using multiple RPCs. Here is the > plan: > 1. Each getListing RPC has an upper limit on the number of items returned. > This limit could be configurable, but I am thinking of setting it to a fixed > number like 500. > 2. Each RPC additionally specifies a start position for this listing request. > I am thinking of using the last item of the previous listing RPC as an > indicator. 
Since the NameNode stores all items in a directory as a sorted array, > the NameNode uses the last item to locate the start item of this listing even if > the last item is deleted in between these two consecutive calls. This has the > advantage of avoiding duplicate entries at the client side. > 3. The return value additionally specifies whether the whole directory has been > listed. If the client sees a false flag, it will continue to issue another > RPC. > This proposal changes the semantics of large directory listing in the sense > that listing is no longer an atomic operation if a directory's content is > changing while the listing operation is in progress. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
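The paging scheme in the HDFS-985 proposal - a per-call item limit, a start key taken from the last item of the previous page, and a server-side resume over a sorted listing - can be sketched as follows. The class and method names (PagedListing, getListing, listAll) are illustrative, not the actual HDFS API, and the "done listing" flag is simplified here to a short-page check:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Sketch of paged directory listing; names are illustrative, not the HDFS API. */
class PagedListing {
    // Stand-in for the NameNode's sorted array of directory entries.
    private final TreeSet<String> entries = new TreeSet<>();

    PagedListing(List<String> items) {
        entries.addAll(items);
    }

    /** "Server" side: up to 'limit' entries strictly after 'startAfter'.
     *  Resuming by key works even if the start item was deleted in between calls. */
    List<String> getListing(String startAfter, int limit) {
        List<String> page = new ArrayList<>();
        for (String e : entries.tailSet(startAfter, false)) {
            if (page.size() == limit) break;
            page.add(e);
        }
        return page;
    }

    /** "Client" side: repeated calls, using the last item of each page as the next start key. */
    List<String> listAll(int limit) {
        List<String> all = new ArrayList<>();
        String start = "";  // empty string sorts before any entry name
        while (true) {
            List<String> page = getListing(start, limit);
            all.addAll(page);
            if (page.size() < limit) break;  // short page: directory fully listed
            start = page.get(page.size() - 1);
        }
        return all;
    }
}
```

Because each page resumes strictly after the previous last key, a concurrent deletion of that key cannot cause duplicates at the client - which is exactly the non-atomic-but-duplicate-free semantics the proposal accepts.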
[jira] Updated: (HDFS-826) Allow a mechanism for an application to detect that datanode(s) have died in the write pipeline
[ https://issues.apache.org/jira/browse/HDFS-826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HDFS-826: -- Status: Patch Available (was: Open) Can somebody please review this patch? This is needed to make HBase work efficiently. Thanks. > Allow a mechanism for an application to detect that datanode(s) have died in > the write pipeline > > > Key: HDFS-826 > URL: https://issues.apache.org/jira/browse/HDFS-826 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs client >Reporter: dhruba borthakur >Assignee: dhruba borthakur > Attachments: ReplicableHdfs.txt, ReplicableHdfs2.txt, > ReplicableHdfs3.txt > > > HDFS does not replicate the last block of the file that is being currently > written to by an application. Every datanode death in the write pipeline > decreases the reliability of the last block of the currently-being-written > file. This situation can be improved if the application can be notified of a > datanode death in the write pipeline. Then, the application can decide what > is the right course of action to be taken on this event. > In our use-case, the application can close the file on the first datanode > death, and start writing to a newly created file. This ensures that the > reliability guarantee of a block is close to 3 at all times. > One idea is to make DFSOutputStream.write() throw an exception if the number > of datanodes in the write pipeline falls below minimum.replication.factor that > is set on the client (this is backward compatible). -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
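The idea in the last paragraph of HDFS-826 - write() failing when the pipeline drops below a minimum replication, so the application can close and roll to a new file - can be sketched roughly as below. PipelineStream and getNumCurrentReplicas() are hypothetical stand-ins for the client-side output stream API, not the attached patch:

```java
import java.io.IOException;

/** Hypothetical stand-in for the HDFS client output stream. */
interface PipelineStream {
    int getNumCurrentReplicas();          // live datanodes in the write pipeline
    void write(byte[] b) throws IOException;
    void close() throws IOException;
}

/** Sketch of the proposed behavior: fail the write when the pipeline is degraded. */
class ReplicationAwareWriter {
    private final int minReplication;

    ReplicationAwareWriter(int minReplication) {
        this.minReplication = minReplication;
    }

    /** Write, throwing when the pipeline has lost too many datanodes.
     *  The application can catch this, close the file, and open a new one. */
    void checkedWrite(PipelineStream out, byte[] data) throws IOException {
        if (out.getNumCurrentReplicas() < minReplication) {
            out.close();
            throw new IOException("write pipeline below minimum replication");
        }
        out.write(data);
    }
}
```

An HBase-style caller would wrap its log appends in a try/catch, treating the exception as the signal to roll to a freshly created file whose pipeline is again fully replicated.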
[jira] Updated: (HDFS-729) fsck option to list only corrupted files
[ https://issues.apache.org/jira/browse/HDFS-729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] dhruba borthakur updated HDFS-729: -- Status: Open (was: Patch Available) Thanks Rodrigo. I will wait for your new patch. > fsck option to list only corrupted files > > > Key: HDFS-729 > URL: https://issues.apache.org/jira/browse/HDFS-729 > Project: Hadoop HDFS > Issue Type: Improvement >Reporter: dhruba borthakur >Assignee: Rodrigo Schmidt > Attachments: badFiles.txt, badFiles2.txt, corruptFiles.txt, > HDFS-729.1.patch, HDFS-729.2.patch > > > An option to fsck to list only corrupted files will be very helpful for > frequent monitoring. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1006) getImage/putImage http requests should be https for the case of security enabled.
[ https://issues.apache.org/jira/browse/HDFS-1006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HDFS-1006: -- Attachment: HDFS-1006-Y20.1.patch Minor updates to the previous patch. > getImage/putImage http requests should be https for the case of security > enabled. > - > > Key: HDFS-1006 > URL: https://issues.apache.org/jira/browse/HDFS-1006 > Project: Hadoop HDFS > Issue Type: Bug >Reporter: Boris Shkolnik >Assignee: Boris Shkolnik > Attachments: HDFS-1006-BP20.patch, HDFS-1006-Y20.1.patch, > HDFS-1006-Y20.patch > > > should use https:// and port 50475 -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1007) HFTP needs to be updated to use delegation tokens
[ https://issues.apache.org/jira/browse/HDFS-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HDFS-1007: -- Attachment: distcp-hftp.2.1.patch This patch is a bugfix on top of the distcp-hftp.2.patch. > HFTP needs to be updated to use delegation tokens > - > > Key: HDFS-1007 > URL: https://issues.apache.org/jira/browse/HDFS-1007 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 0.22.0 >Reporter: Devaraj Das > Fix For: 0.22.0 > > Attachments: distcp-hftp.1.patch, distcp-hftp.2.1.patch, > distcp-hftp.2.patch, distcp-hftp.patch > > > HFTPFileSystem should be updated to use the delegation tokens so that it can > talk to the secure namenodes. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Commented: (HDFS-1005) Fsck security
[ https://issues.apache.org/jira/browse/HDFS-1005?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12839597#action_12839597 ] Boris Shkolnik commented on HDFS-1005: -- HDFS-1005-BP20.patch for previous version of Hadoop. Not for commit. > Fsck security > - > > Key: HDFS-1005 > URL: https://issues.apache.org/jira/browse/HDFS-1005 > Project: Hadoop HDFS > Issue Type: New Feature >Reporter: Jitendra Nath Pandey >Assignee: Boris Shkolnik > Attachments: HDFS-1005-BP20.patch, HDFS-1005-y20.1.patch > > > This jira tracks implementation of security for Fsck. Fsck should make an > authenticated connection to the namenode. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Updated: (HDFS-1007) HFTP needs to be updated to use delegation tokens
[ https://issues.apache.org/jira/browse/HDFS-1007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Devaraj Das updated HDFS-1007: -- Attachment: distcp-hftp.2.patch Updates HsftpFileSystem. Patch for Y20. Not for commit here. > HFTP needs to be updated to use delegation tokens > - > > Key: HDFS-1007 > URL: https://issues.apache.org/jira/browse/HDFS-1007 > Project: Hadoop HDFS > Issue Type: Bug > Components: security >Affects Versions: 0.22.0 >Reporter: Devaraj Das > Fix For: 0.22.0 > > Attachments: distcp-hftp.1.patch, distcp-hftp.2.patch, > distcp-hftp.patch > > > HFTPFileSystem should be updated to use the delegation tokens so that it can > talk to the secure namenodes. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.
[jira] Created: (HDFS-1012) documentLocation attribute in LdapEntry for HDFSProxy isn't specific to a cluster
documentLocation attribute in LdapEntry for HDFSProxy isn't specific to a cluster - Key: HDFS-1012 URL: https://issues.apache.org/jira/browse/HDFS-1012 Project: Hadoop HDFS Issue Type: Improvement Components: contrib/hdfsproxy Affects Versions: 0.20.1, 0.20.2, 0.21.0, 0.22.0 Reporter: Srikanth Sundarrajan The list of allowed document locations accessible through HDFSProxy isn't specific to a cluster. LDAP entries can include the name of the cluster to which the path belongs, to allow better control over which clusters/paths a user can access through HDFSProxy. -- This message is automatically generated by JIRA. - You can reply to this email to add a comment to the issue online.