[jira] [Commented] (HDFS-3880) Use Builder to get RPC server in HDFS
[ https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446892#comment-13446892 ] Hudson commented on HDFS-3880: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2697 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2697/]) HDFS-3880. Use Builder to build RPC server in HDFS. Contributed by Brandon Li. (Revision 1379917) Result = FAILURE suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379917 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/journalservice/JournalService.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/TestClientProtocolWithDelegationToken.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/security/token/block/TestBlockToken.java > Use Builder to get RPC server in HDFS > - > > Key: HDFS-3880 > URL: https://issues.apache.org/jira/browse/HDFS-3880 > Project: Hadoop HDFS > Issue Type: Improvement > Components: data-node, ha, name-node, security >Affects Versions: 3.0.0 >Reporter: Brandon Li >Assignee: Brandon Li >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-3880.patch > > > In HADOOP-8736, a Builder is introduced to replace all the getServer() > variants. This JIRA is the change in HDFS. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
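The Builder referenced above replaces the long list of getServer() overloads with one fluent construction path. Below is a minimal, self-contained sketch of that pattern; the names are illustrative only, loosely modeled on (but not identical to) the org.apache.hadoop.ipc.RPC.Builder introduced by HADOOP-8736.

```java
// Sketch of the builder pattern HADOOP-8736 introduces for RPC servers.
// Names are illustrative; the real class is org.apache.hadoop.ipc.RPC.Builder.
public class RpcBuilderSketch {
    static class Server {
        final String bindAddress;
        final int port;
        final int numHandlers;
        Server(String bindAddress, int port, int numHandlers) {
            this.bindAddress = bindAddress;
            this.port = port;
            this.numHandlers = numHandlers;
        }
    }

    static class Builder {
        // Defaults remove the need for one getServer() overload per combination.
        private String bindAddress = "0.0.0.0";
        private int port = 0;
        private int numHandlers = 1;

        Builder setBindAddress(String addr) { bindAddress = addr; return this; }
        Builder setPort(int p) { port = p; return this; }
        Builder setNumHandlers(int n) { numHandlers = n; return this; }
        Server build() { return new Server(bindAddress, port, numHandlers); }
    }

    public static void main(String[] args) {
        // One readable call site instead of choosing among many overloads.
        Server s = new Builder()
            .setBindAddress("127.0.0.1").setPort(8020).setNumHandlers(10).build();
        System.out.println(s.bindAddress + ":" + s.port + " handlers=" + s.numHandlers);
    }
}
```

Adding a new optional server parameter then means one new setter rather than another overload, which is why the patch touches every getServer() call site in DataNode, JournalService, and NameNodeRpcServer.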
[jira] [Commented] (HDFS-3880) Use Builder to get RPC server in HDFS
[ https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446891#comment-13446891 ] Hudson commented on HDFS-3880: -- Integrated in Hadoop-Common-trunk-Commit #2672 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2672/]) HDFS-3880. Use Builder to build RPC server in HDFS. Contributed by Brandon Li. (Revision 1379917) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379917 Files : same six files as the Hadoop-Mapreduce-trunk-Commit #2697 message above.
[jira] [Commented] (HDFS-3880) Use Builder to get RPC server in HDFS
[ https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446890#comment-13446890 ] Hudson commented on HDFS-3880: -- Integrated in Hadoop-Hdfs-trunk-Commit #2735 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2735/]) HDFS-3880. Use Builder to build RPC server in HDFS. Contributed by Brandon Li. (Revision 1379917) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379917 Files : same six files as the Hadoop-Mapreduce-trunk-Commit #2697 message above.
[jira] [Updated] (HDFS-3880) Use Builder to get RPC server in HDFS
[ https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Suresh Srinivas updated HDFS-3880: -- Resolution: Fixed Fix Version/s: 3.0.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed the change to trunk. Thank you, Brandon.
[jira] [Commented] (HDFS-3880) Use Builder to get RPC server in HDFS
[ https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446884#comment-13446884 ] Suresh Srinivas commented on HDFS-3880: --- +1 for the patch.
[jira] [Commented] (HDFS-3886) Shutdown requests can possibly check for checkpoint issues (corrupted edits) and save a good namespace copy before closing down?
[ https://issues.apache.org/jira/browse/HDFS-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446883#comment-13446883 ] Aaron T. Myers commented on HDFS-3886: -- Interesting idea. Perhaps we could add a "clean shutdown" dfsadmin command, and then add an extra action to the init.d script which a cautious admin can choose to run? That way we preserve the shutdown behavior that Steve is concerned about, but give the admin an option to have guaranteed-good metadata? Just thinking out loud. > Shutdown requests can possibly check for checkpoint issues (corrupted edits) > and save a good namespace copy before closing down? > > > Key: HDFS-3886 > URL: https://issues.apache.org/jira/browse/HDFS-3886 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 2.0.0-alpha >Reporter: Harsh J >Priority: Minor > > HDFS-3878 sorta gives me this idea. Aside of having a method to download it > to a different location, we can also lock up the namesystem (or deactivate > the client rpc server) and save the namesystem before we complete up the > shutdown. > The init.d/shutdown scripts would have to work with this somehow though, to > not kill -9 it when in-process. Also, the new image may be stored in a > shutdown.chkpt directory, to not interfere in the regular dirs, but still > allow easier recovery. > Obviously this will still not work if all directories are broken. So maybe we > could have some configs to tackle that as well? > I haven't thought this through, so let me know what part is wrong to do :)
[jira] [Updated] (HDFS-3880) Use Builder to get RPC server in HDFS
[ https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-3880: - Status: Patch Available (was: Open)
[jira] [Updated] (HDFS-3880) Use Builder to get RPC server in HDFS
[ https://issues.apache.org/jira/browse/HDFS-3880?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Brandon Li updated HDFS-3880: - Status: Open (was: Patch Available)
[jira] [Commented] (HDFS-3876) NN should not RPC to self to find trash defaults (causes deadlock)
[ https://issues.apache.org/jira/browse/HDFS-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446862#comment-13446862 ] Hadoop QA commented on HDFS-3876: - +1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12543455/hdfs-3876.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. +1 core tests. The patch passed unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs. +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3138//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3138//console This message is automatically generated. > NN should not RPC to self to find trash defaults (causes deadlock) > -- > > Key: HDFS-3876 > URL: https://issues.apache.org/jira/browse/HDFS-3876 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node >Affects Versions: 2.2.0-alpha >Reporter: Todd Lipcon >Assignee: Eli Collins >Priority: Blocker > Attachments: hdfs-3876.txt, hdfs-3876.txt, hdfs-3876.txt > > > When transitioning a SBN to active, I ran into the following situation: > - the TrashPolicy first gets loaded by an IPC Server Handler thread. The > {{initialize}} function then tries to make an RPC to the same node to find > out the defaults. > - This is happening inside the NN write lock (since it's part of the active > initialization). Hence, all of the other handler threads are already blocked > waiting to get the NN lock. > - Since no handler threads are free, the RPC blocks forever and the NN never > enters active state. > We need to have a general policy that the NN should never make RPCs to itself > for any reason, due to potential for deadlocks like this.
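The deadlock described in HDFS-3876 can be modeled with nothing but a bounded thread pool: a task running on the pool's only thread submits a second task to the same pool and blocks on its result. Everything below is an illustrative toy model, not NameNode code; the real bug hangs forever, so the demo uses a timeout to terminate.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Toy model of the self-RPC hang: the "handler pool" has one thread, and the
// running handler issues a call that only the same pool can service.
public class SelfRpcDeadlock {

    /** Returns true if the inner "self-RPC" starves. */
    static boolean selfCallStarves() throws Exception {
        ExecutorService handlers = Executors.newFixedThreadPool(1);
        try {
            Future<Boolean> outer = handlers.submit(() -> {
                // This handler occupies the only handler thread (the analogue
                // of holding the NN write lock during active initialization)...
                Future<String> inner = handlers.submit(() -> "server defaults");
                try {
                    // ...so the "RPC to self" can never be scheduled.
                    inner.get(500, TimeUnit.MILLISECONDS);
                    return false;
                } catch (TimeoutException starved) {
                    return true; // without the timeout this would block forever
                }
            });
            return outer.get();
        } finally {
            handlers.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("self-call starved: " + selfCallStarves());
    }
}
```

This is why the issue summary argues for a blanket policy: with a finite handler pool, any blocking self-call can starve, and the deadlock only appears under load or, as here, when every other handler is already blocked.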
[jira] [Commented] (HDFS-3828) Block Scanner rescans blocks too frequently
[ https://issues.apache.org/jira/browse/HDFS-3828?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446861#comment-13446861 ] Eli Collins commented on HDFS-3828: --- Agree this approach is best for now. Please file a jira for the proposed refactoring outlining the issues it addresses with the current approach (eg that we do unnecessary work if we finish the scan w/in the period). - Per the findbugs warning I'd pull your new check in scanBlockPoolSlice out to a synchronized method (eg workRemainingInCurrentPeriod) - DataBlockScanner#run should use SLEEP_PERIOD_MS (could use in getNextBPScanner as well, though it and waitForInit aren't part of the "period") - In getTotalScans rather than throw IOE if given a bpid w/o a scanner I believe this should be an assert (we should always have a scanner for a block pool if we've enabled scanning, which we have if we're in DataBlockScanner) > Block Scanner rescans blocks too frequently > --- > > Key: HDFS-3828 > URL: https://issues.apache.org/jira/browse/HDFS-3828 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 0.23.0, 2.0.0-alpha >Reporter: Andy Isaacson >Assignee: Andy Isaacson > Attachments: hdfs-3828-1.txt, hdfs-3828-2.txt, hdfs3828.txt > > > {{BlockPoolSliceScanner#scan}} calls cleanUp every time it's invoked from > {{DataBlockScanner#run}} via {{scanBlockPoolSlice}}. But cleanUp > unconditionally roll()s the verificationLogs, so after two iterations we have > lost the first iteration of block verification times. As a result a cluster > with just one block repeatedly rescans it every 10 seconds: > {noformat} > 2012-08-16 15:59:57,884 INFO datanode.BlockPoolSliceScanner > (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for > BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915 > 2012-08-16 16:00:07,904 INFO datanode.BlockPoolSliceScanner > (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for > BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915 > 2012-08-16 16:00:17,925 INFO datanode.BlockPoolSliceScanner > (BlockPoolSliceScanner.java:verifyBlock(391)) - Verification succeeded for > BP-2101131164-172.29.122.91-1337906886255:blk_7919273167187535506_4915 > {noformat} > To fix this, we need to avoid roll()ing the logs multiple times per period.
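The proposed fix, avoiding roll()ing the verification logs more than once per period, can be sketched as a small guard. All names here are hypothetical; the real logic lives in BlockPoolSliceScanner.

```java
// Sketch of "roll the verification log at most once per scan period" so the
// previous period's verification times survive. Names are illustrative.
public class RollOncePerPeriod {
    private final long periodMs;
    private long lastRollMs;
    private int rolls = 0;

    RollOncePerPeriod(long periodMs) {
        this.periodMs = periodMs;
        this.lastRollMs = -periodMs; // the very first call is allowed to roll
    }

    /** Called on every scan iteration; returns true only when a roll happened. */
    boolean maybeRoll(long nowMs) {
        if (nowMs - lastRollMs < periodMs) {
            return false; // already rolled this period: keep the current log
        }
        lastRollMs = nowMs;
        rolls++; // here the real code would roll() the verification log
        return true;
    }

    int rollCount() { return rolls; }

    public static void main(String[] args) {
        RollOncePerPeriod log = new RollOncePerPeriod(1000);
        for (long t = 0; t < 1000; t += 100) {
            log.maybeRoll(t); // ten scan iterations inside a single period
        }
        // Without the guard the log would have rolled ten times, discarding
        // the history that proves each block was recently verified.
        System.out.println("rolls=" + log.rollCount());
    }
}
```

With this guard, verification times recorded earlier in the period are still present when the scanner decides what to rescan, so a single-block cluster is no longer rescanned every few seconds.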
[jira] [Commented] (HDFS-3876) NN should not RPC to self to find trash defaults (causes deadlock)
[ https://issues.apache.org/jira/browse/HDFS-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446858#comment-13446858 ] Hadoop QA commented on HDFS-3876: - -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12543455/hdfs-3876.txt against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified test files. +1 javac. The applied patch does not increase the total number of javac compiler warnings. +1 javadoc. The javadoc tool did not generate any warning messages. +1 eclipse:eclipse. The patch built with eclipse:eclipse. +1 findbugs. The patch does not introduce any new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. The patch failed these unit tests in hadoop-common-project/hadoop-common hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks +1 contrib tests. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/3137//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/3137//console This message is automatically generated.
[jira] [Commented] (HDFS-3870) QJM: add metrics to JournalNode
[ https://issues.apache.org/jira/browse/HDFS-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446854#comment-13446854 ] Eli Collins commented on HDFS-3870: --- +1 looks great. Sync and lag are IMO the most interesting. Only other useful metrics I can think of are additions to the NN's JN metrics (NN -> Journal latency and failed Journal operations) though those aren't QJM specific. > QJM: add metrics to JournalNode > --- > > Key: HDFS-3870 > URL: https://issues.apache.org/jira/browse/HDFS-3870 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: QuorumJournalManager (HDFS-3077) >Reporter: Todd Lipcon >Assignee: Todd Lipcon > Attachments: hdfs-3870.txt > > > The JournalNode should expose some basic metrics through the usual interface. > In particular: > - the writer epoch, accepted epoch, > - the last written transaction ID and last committed txid (which may be newer > in case that it's in the process of catching up) > - latency information for how long the syncs are taking > Please feel free to suggest others that come to mind.
[jira] [Commented] (HDFS-3884) QJM: Journal format() should reset cached values
[ https://issues.apache.org/jira/browse/HDFS-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446853#comment-13446853 ] Eli Collins commented on HDFS-3884: --- +1 lgtm > QJM: Journal format() should reset cached values > > > Key: HDFS-3884 > URL: https://issues.apache.org/jira/browse/HDFS-3884 > Project: Hadoop HDFS > Issue Type: Sub-task >Affects Versions: QuorumJournalManager (HDFS-3077) >Reporter: Todd Lipcon >Assignee: Todd Lipcon > Fix For: QuorumJournalManager (HDFS-3077) > > Attachments: hdfs-3884.txt > > > Simple bug in the JournalNode: it caches certain values (eg accepted epoch) > in memory, and the cached values aren't reset when the journal is formatted. > So, after a format, further calls to the same Journal will see the old value > for accepted epoch, writer epoch, etc, preventing the journal from being > re-used until the JN is restarted.
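The HDFS-3884 bug fits in a few lines: format() wipes on-disk state but not the in-memory cache. A hedged sketch, with illustrative field names rather than the JournalNode's actual ones:

```java
// Sketch of the fix: format() must also reset cached in-memory values, or a
// re-used Journal serves stale epochs until the JN restarts.
public class JournalFormatSketch {
    private long acceptedEpoch;
    private long writerEpoch;

    void newEpoch(long epoch) {
        acceptedEpoch = epoch;
        writerEpoch = epoch;
    }

    /** Wipes persistent state; the fix is clearing the cached values too. */
    void format() {
        // ... delete on-disk storage here ...
        acceptedEpoch = 0; // without these two resets, post-format callers
        writerEpoch = 0;   // still observe the pre-format epochs
    }

    long getAcceptedEpoch() { return acceptedEpoch; }
    long getWriterEpoch() { return writerEpoch; }

    public static void main(String[] args) {
        JournalFormatSketch j = new JournalFormatSketch();
        j.newEpoch(7);
        j.format();
        System.out.println("acceptedEpoch after format = " + j.getAcceptedEpoch());
    }
}
```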
[jira] [Commented] (HDFS-3863) QJM: track last "committed" txid
[ https://issues.apache.org/jira/browse/HDFS-3863?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446852#comment-13446852 ] Eli Collins commented on HDFS-3863: --- Agree w/ you and Chao Shi, nice change to the protocol. Consider making committedTxId and lastCommittedTxId non-optional? Why not use INVALID_TXID rather than 0 as a default value in the file and protocol for tracking the committed txid? > QJM: track last "committed" txid > > > Key: HDFS-3863 > URL: https://issues.apache.org/jira/browse/HDFS-3863 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: ha >Affects Versions: QuorumJournalManager (HDFS-3077) >Reporter: Todd Lipcon >Assignee: Todd Lipcon > Attachments: hdfs-3863-prelim.txt, hdfs-3863.txt > > > Per some discussion with [~stepinto] > [here|https://issues.apache.org/jira/browse/HDFS-3077?focusedCommentId=13422579&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13422579], > we should keep track of the "last committed txid" on each JournalNode. Then > during any recovery operation, we can sanity-check that we aren't asked to > truncate a log to an earlier transaction. > This is also a necessary step if we want to support reading from in-progress > segments in the future (since we should only allow reads up to the commit > point)
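The sanity check described in HDFS-3863, never truncate below the last committed txid during recovery, fits in a few lines. The names below are illustrative, not the QJM implementation:

```java
// Sketch of tracking the last committed txid and refusing a recovery that
// would truncate committed transactions.
public class CommittedTxidGuard {
    // The review comment above suggests an INVALID_TXID sentinel rather than
    // 0 as the "nothing committed yet" default; 0 is used here for brevity.
    private long lastCommittedTxId = 0;

    void commit(long txid) {
        lastCommittedTxId = Math.max(lastCommittedTxId, txid);
    }

    /** Recovery may only truncate at or beyond the commit point. */
    void truncateTo(long txid) {
        if (txid < lastCommittedTxId) {
            throw new IllegalStateException("refusing to truncate to " + txid
                + ": below committed txid " + lastCommittedTxId);
        }
        // ... perform the truncation ...
    }

    public static void main(String[] args) {
        CommittedTxidGuard journal = new CommittedTxidGuard();
        journal.commit(150);
        journal.truncateTo(150); // fine: exactly at the commit point
        try {
            journal.truncateTo(100); // would silently lose committed edits
        } catch (IllegalStateException expected) {
            System.out.println(expected.getMessage());
        }
    }
}
```

The same invariant is what would gate reads of in-progress segments later: readers stop at the commit point rather than at the last written txid.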
[jira] [Commented] (HDFS-3869) QJM: expose non-file journal manager details in web UI
[ https://issues.apache.org/jira/browse/HDFS-3869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446847#comment-13446847 ] Eli Collins commented on HDFS-3869: --- +1 looks great. I'd consider mentioning in a comment above journals that it is COW because even though FSEdit* is synchronized these uses may race with the Web UI (my understanding of why it is now a CopyOnWriteArrayList). > QJM: expose non-file journal manager details in web UI > -- > > Key: HDFS-3869 > URL: https://issues.apache.org/jira/browse/HDFS-3869 > Project: Hadoop HDFS > Issue Type: Sub-task > Components: name-node >Affects Versions: QuorumJournalManager (HDFS-3077) >Reporter: Todd Lipcon >Assignee: Todd Lipcon > Attachments: dir-failed.png, hdfs-3869.txt, hdfs-3869.txt, > lagging-jn.png, open-for-read.png, open-for-write.png > > > Currently, the NN web UI only contains NN storage directories on local disk. > It should also include details about any non-file JournalManagers in use. > This JIRA targets the QJM branch, but will be useful for BKJM as well.
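The CopyOnWriteArrayList point in the review is worth a concrete illustration: an iterator over such a list is a snapshot, so a web-UI reader racing with a writer never sees a half-updated list and never throws ConcurrentModificationException. This is a plain stdlib demo, not Hadoop code; the journal URIs are made up.

```java
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Why CopyOnWriteArrayList suits a rarely-mutated, concurrently-read list
// such as the NN's journals list: iterators see a stable snapshot.
public class CowJournalsDemo {

    /** Returns {entries seen by a snapshot iterator, final live size}. */
    static int[] demo() {
        List<String> journals = new CopyOnWriteArrayList<>();
        journals.add("file:///data/nn/edits");                            // hypothetical
        journals.add("qjournal://jn1:8485;jn2:8485;jn3:8485/myjournal");  // hypothetical

        Iterator<String> uiView = journals.iterator(); // snapshot taken here
        journals.add("file:///backup/nn/edits");       // concurrent "writer"

        int seen = 0;
        while (uiView.hasNext()) {
            uiView.next();
            seen++; // counts the two entries present at snapshot time
        }
        return new int[] { seen, journals.size() };
    }

    public static void main(String[] args) {
        int[] r = demo();
        // No ConcurrentModificationException; the iterator saw the snapshot.
        System.out.println("snapshot size=" + r[0] + ", live size=" + r[1]);
    }
}
```

The trade-off is that every mutation copies the backing array, which is fine for a list that changes only when journals are added, removed, or marked failed.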
[jira] [Updated] (HDFS-3876) NN should not RPC to self to find trash defaults (causes deadlock)
[ https://issues.apache.org/jira/browse/HDFS-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3876: -- Attachment: hdfs-3876.txt
[jira] [Updated] (HDFS-3876) NN should not RPC to self to find trash defaults (causes deadlock)
[ https://issues.apache.org/jira/browse/HDFS-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3876: -- Attachment: (was: hdfs-3876.txt)
[jira] [Updated] (HDFS-3876) NN should not RPC to self to find trash defaults (causes deadlock)
[ https://issues.apache.org/jira/browse/HDFS-3876?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eli Collins updated HDFS-3876: -- Attachment: hdfs-3876.txt TestViewFsTrash failed because the test deletes "/" and we now get server defaults on the path (to get the trash configuration), which fails for viewfs for "/" because "/" is not associated with a file system and we fail due to a NotInMountpointException. Updated patch: catch Exception rather than IOE when obtaining server defaults, so we fail the delete when we fail to get server defaults (rather than potentially ignoring the server trash configuration on transient errors), and updated TestTrash to not fail the test when the delete fails due to obtaining server defaults (which is what should happen in the viewfs case).
[jira] [Commented] (HDFS-2580) NameNode#main(...) can make use of GenericOptionsParser.
[ https://issues.apache.org/jira/browse/HDFS-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446788#comment-13446788 ] Hudson commented on HDFS-2580: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2696 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2696/]) HDFS-2580. NameNode#main(...) can make use of GenericOptionsParser. Contributed by harsh. (harsh) (Revision 1379828) Result = FAILURE harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379828 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java > NameNode#main(...) can make use of GenericOptionsParser. > > > Key: HDFS-2580 > URL: https://issues.apache.org/jira/browse/HDFS-2580 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Affects Versions: 0.23.0 >Reporter: Harsh J >Assignee: Harsh J >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-2580.patch > > > DataNode supports passing generic opts when calling via {{hdfs datanode}}. > NameNode can support the same thing as well, but doesn't right now.
[jira] [Commented] (HDFS-2580) NameNode#main(...) can make use of GenericOptionsParser.
[ https://issues.apache.org/jira/browse/HDFS-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446785#comment-13446785 ] Hudson commented on HDFS-2580: -- Integrated in Hadoop-Hdfs-trunk-Commit #2734 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2734/]) HDFS-2580. NameNode#main(...) can make use of GenericOptionsParser. Contributed by harsh. (harsh) (Revision 1379828) Result = SUCCESS harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379828 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java > NameNode#main(...) can make use of GenericOptionsParser. > > > Key: HDFS-2580 > URL: https://issues.apache.org/jira/browse/HDFS-2580 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Affects Versions: 0.23.0 >Reporter: Harsh J >Assignee: Harsh J >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-2580.patch > > > DataNode supports passing generic opts when calling via {{hdfs datanode}}. > NameNode can support the same thing as well, but doesn't right now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-2580) NameNode#main(...) can make use of GenericOptionsParser.
[ https://issues.apache.org/jira/browse/HDFS-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446784#comment-13446784 ] Hudson commented on HDFS-2580: -- Integrated in Hadoop-Common-trunk-Commit #2671 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2671/]) HDFS-2580. NameNode#main(...) can make use of GenericOptionsParser. Contributed by harsh. (harsh) (Revision 1379828) Result = SUCCESS harsh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379828 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNode.java > NameNode#main(...) can make use of GenericOptionsParser. > > > Key: HDFS-2580 > URL: https://issues.apache.org/jira/browse/HDFS-2580 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Affects Versions: 0.23.0 >Reporter: Harsh J >Assignee: Harsh J >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-2580.patch > > > DataNode supports passing generic opts when calling via {{hdfs datanode}}. > NameNode can support the same thing as well, but doesn't right now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HDFS-2580) NameNode#main(...) can make use of GenericOptionsParser.
[ https://issues.apache.org/jira/browse/HDFS-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Harsh J updated HDFS-2580: -- Resolution: Fixed Fix Version/s: 3.0.0 Target Version/s: (was: 3.0.0) Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks Eli. Rebased and committed to trunk as r1379828. > NameNode#main(...) can make use of GenericOptionsParser. > > > Key: HDFS-2580 > URL: https://issues.apache.org/jira/browse/HDFS-2580 > Project: Hadoop HDFS > Issue Type: Improvement > Components: name-node >Affects Versions: 0.23.0 >Reporter: Harsh J >Assignee: Harsh J >Priority: Minor > Fix For: 3.0.0 > > Attachments: HDFS-2580.patch > > > DataNode supports passing generic opts when calling via {{hdfs datanode}}. > NameNode can support the same thing as well, but doesn't right now. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
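GenericOptionsParser itself lives in hadoop-common, so as a self-contained illustration of the pattern HDFS-2580 adds to NameNode#main, the sketch below peels off generic `-D key=value` options before the daemon parses its own arguments. Class and method names are hypothetical stand-ins; the real parser also handles options such as -conf, -fs, and -jt.

```java
import java.util.*;

public class GenericOptsSketch {
    // Stand-in for the Configuration the parsed -D options would populate.
    static Map<String, String> conf = new HashMap<>();

    // Consume "-D key=value" pairs, return the remaining daemon arguments.
    static String[] parseGenericOptions(String[] args) {
        List<String> remaining = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if ("-D".equals(args[i]) && i + 1 < args.length) {
                String[] kv = args[++i].split("=", 2);
                conf.put(kv[0], kv.length > 1 ? kv[1] : "");
            } else {
                remaining.add(args[i]);
            }
        }
        return remaining.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String[] rest = parseGenericOptions(
            new String[] {"-D", "dfs.namenode.rpc-address=nn:8020", "-format"});
        System.out.println(conf.get("dfs.namenode.rpc-address")); // nn:8020
        System.out.println(rest[0]);                              // -format
    }
}
```

The point of the JIRA is simply that `hdfs namenode` should accept the same generic options `hdfs datanode` already does, by running its arguments through this kind of split first.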
[jira] [Commented] (HDFS-3854) Implement a fence method which should fence the BK shared storage.
[ https://issues.apache.org/jira/browse/HDFS-3854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446778#comment-13446778 ] Uma Maheswara Rao G commented on HDFS-3854: ---

Per the discussion in HDFS-3862, we may disable the fencing option for single-writer storage.

> Implement a fence method which should fence the BK shared storage.
> ------------------------------------------------------------------
>
> Key: HDFS-3854
> URL: https://issues.apache.org/jira/browse/HDFS-3854
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: name-node
> Reporter: Uma Maheswara Rao G
>
> Currently, when the machine or the network is down, SSHFence cannot ensure that the other node is completely down. So the fence will fail and the failover will not happen. [Internally we worked around this by returning true when the machine is not reachable, as BKJM already has fencing.]
> It may be a good idea to implement a fence method which ensures the shared storage is fenced properly and returns true. We can plug this new method into the ZKFC fence methods.
> The only pain point I can see is that we may have to put the BKJM jar in the ZKFC lib to run this fence method. Thoughts?
[jira] [Commented] (HDFS-3862) QJM: don't require a fencer to be configured if shared storage has built-in single-writer semantics
[ https://issues.apache.org/jira/browse/HDFS-3862?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446777#comment-13446777 ] Uma Maheswara Rao G commented on HDFS-3862: ---

Todd, this seems reasonable to me. I also filed a JIRA to handle this situation with a single writer, HDFS-3854. My thought there was that we could simply provide a fence method which fences the writer; that guarantees no other NN can access the shared storage before we go ahead with the state change. If we are OK with leaving the fencing at the writer level, that is even better. Currently we just have a dummy fence method which returns true, as BK already has fencing.

Regarding the suggestion above of adding an API in JournalManager: wouldn't it require creating the JournalManager just to get this info in the ZKFC? How about simply adding one config parameter?

> QJM: don't require a fencer to be configured if shared storage has built-in single-writer semantics
> ---------------------------------------------------------------------------------------------------
>
> Key: HDFS-3862
> URL: https://issues.apache.org/jira/browse/HDFS-3862
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ha
> Affects Versions: QuorumJournalManager (HDFS-3077)
> Reporter: Todd Lipcon
>
> Currently, NN HA requires that the administrator configure a fencing method to ensure that only a single NameNode may write to the shared storage at a time. Some shared edits storage implementations (like QJM) inherently enforce single-writer semantics at the storage level, and thus the user should not be forced to specify one.
> We should extend the JournalManager interface so that the HA code can operate without a configured fencer if the JM has such built-in fencing.
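One possible shape for the JournalManager extension discussed in this thread, sketched with hypothetical stand-in types rather than the real Hadoop interfaces: the journal manager reports whether its storage enforces single-writer semantics, and the HA code only insists on a configured fencer when it does not.

```java
public class FencingCheckSketch {
    // Hypothetical stand-in for the proposed JournalManager capability.
    interface JournalManagerSketch {
        // True when the storage itself guarantees a single writer
        // (e.g. QJM or BKJM); false for a plain shared directory.
        boolean hasBuiltInFencing();
    }

    // The HA code's check: a fencer is mandatory only when neither the
    // storage nor the configuration provides fencing.
    static boolean fencerRequired(JournalManagerSketch jm, boolean fencerConfigured) {
        return !jm.hasBuiltInFencing() && !fencerConfigured;
    }

    public static void main(String[] args) {
        JournalManagerSketch quorumLike = () -> true;  // QJM/BKJM-style storage
        JournalManagerSketch fileLike   = () -> false; // plain shared directory
        System.out.println(fencerRequired(quorumLike, false)); // false
        System.out.println(fencerRequired(fileLike, false));   // true
    }
}
```

Uma's alternative, a plain config parameter, would replace the interface query with a boolean read from configuration; the trade-off is that the ZKFC then would not need to instantiate the JournalManager, at the cost of the admin having to keep the flag consistent with the storage actually in use.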
Re: NativeS3FileSystem problem
Any comment on this?

On Aug 28, 2012, at 11:43 PM, Chris Collins wrote:

> I was attempting to use the native S3 file system outside of any map reduce tasks. A simple task of trying to create a directory:
>
> FileSystem fs = FileSystem.get(uri, conf);
> Path currPath = new Path("/a/b/c");
> fs.mkdirs(currPath);
>
> (I can provide full code if needed.)
>
> Anyway, the class Jets3tNativeFileSystemStore attempts to detect whether each key part of the object path exists, expecting a 404 response if it does not:
>
> public FileMetadata retrieveMetadata(String key) throws IOException {
>   try {
>     S3Object object = s3Service.getObjectDetails(bucket, key);
>     return new FileMetadata(key, object.getContentLength(),
>         object.getLastModifiedDate().getTime());
>   } catch (S3ServiceException e) {
>     // Following is brittle. Is there a better way?
>     if (e.getMessage().contains("ResponseCode=404")) {
>       return null;
>     }
>     if (e.getCause() instanceof IOException) {
>       throw (IOException) e.getCause();
>     }
>     throw new S3Exception(e);
>   }
> }
>
> All versions of jets3t I have looked at that have a compatible class structure (i.e. don't blow up on AWSCredentials) actually return an exception whose message contains ".ResponseCode: 404". I took a copy of the code in this directory and fixed the method to read:
>
> public FileMetadata retrieveMetadata(String key) throws IOException {
>   try {
>     S3Object object = s3Service.getObjectDetails(bucket, key);
>     return new FileMetadata(key, object.getContentLength(),
>         object.getLastModifiedDate().getTime());
>   } catch (S3ServiceException e) {
>     // Following is brittle. Is there a better way?
>     if (e.getResponseCode() == 404) {
>       return null;
>     }
>     if (e.getCause() instanceof IOException) {
>       throw (IOException) e.getCause();
>     }
>     throw new S3Exception(e);
>   }
> }
>
> which seems to fix the issue. Am I missing something? Also, this seems to have been broken across a variety of hadoop versions. Does anyone actually use this code path, and if so, is there a valid version combination that should have worked for me?
>
> Comments welcome.
>
> Chris
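To see why the message-matching check is brittle, the self-contained sketch below contrasts it with the response-code check from Chris's fix. `ServiceExceptionSketch` is a hypothetical stand-in for jets3t's S3ServiceException, and the message format is only one of the variants different jets3t releases produce.

```java
public class ResponseCodeSketch {
    // Stand-in for S3ServiceException: carries both a formatted message
    // and the raw HTTP response code.
    static class ServiceExceptionSketch extends Exception {
        private final int responseCode;
        ServiceExceptionSketch(String msg, int code) {
            super(msg);
            responseCode = code;
        }
        int getResponseCode() { return responseCode; }
    }

    // The original check: depends on one exact message format.
    static boolean isNotFoundByMessage(ServiceExceptionSketch e) {
        return e.getMessage().contains("ResponseCode=404");
    }

    // The fixed check: the numeric code is stable across library versions.
    static boolean isNotFoundByCode(ServiceExceptionSketch e) {
        return e.getResponseCode() == 404;
    }

    public static void main(String[] args) {
        // A jets3t build that reports ".ResponseCode: 404" (colon, not '='):
        ServiceExceptionSketch e =
            new ServiceExceptionSketch("S3 Error. ResponseCode: 404", 404);
        System.out.println(isNotFoundByMessage(e)); // false: the check misses it
        System.out.println(isNotFoundByCode(e));    // true
    }
}
```

When the message check misses, retrieveMetadata rethrows instead of returning null, so a simple mkdirs fails even though the 404 just means "the parent key doesn't exist yet", which matches the symptom reported above.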
[jira] [Commented] (HDFS-2434) TestNameNodeMetrics.testCorruptBlock fails intermittently
[ https://issues.apache.org/jira/browse/HDFS-2434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446737#comment-13446737 ] Kihwal Lee commented on HDFS-2434: --

The test case fails this way when the corrupt replica is fixed right away, before the namenode metrics are gathered. In one example, computeReplicationWorkForBlocks() was done within 10 ms of the block corruption and the datanode heartbeated in 380 ms; the block corruption was resolved completely 13 ms after that. Since the replication monitor and DN heartbeats are asynchronous, the current approach of sleeping for 1 second is not a reliable way to hit a moment between the two.

> TestNameNodeMetrics.testCorruptBlock fails intermittently
> ---------------------------------------------------------
>
> Key: HDFS-2434
> URL: https://issues.apache.org/jira/browse/HDFS-2434
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: test
> Reporter: Uma Maheswara Rao G
>
> java.lang.AssertionError: Bad value for metric CorruptBlocks expected:<1> but was:<0>
> at org.junit.Assert.fail(Assert.java:91)
> at org.junit.Assert.failNotEquals(Assert.java:645)
> at org.junit.Assert.assertEquals(Assert.java:126)
> at org.junit.Assert.assertEquals(Assert.java:470)
> at org.apache.hadoop.test.MetricsAsserts.assertGauge(MetricsAsserts.java:185)
> at org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics.__CLR3_0_2t8sh531i1k(TestNameNodeMetrics.java:175)
> at org.apache.hadoop.hdfs.server.namenode.metrics.TestNameNodeMetrics.testCorruptBlock(TestNameNodeMetrics.java:164)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at junit.framework.TestCase.runTest(TestCase.java:168)
> at junit.framework.TestCase.runBare(TestCase.java:134)
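A common remedy for this kind of flakiness is to replace the fixed one-second sleep with a bounded poll on the metric. The sketch below is self-contained plain Java with hypothetical names; the supplier stands in for reading the CorruptBlocks gauge from the namenode metrics.

```java
import java.util.function.Supplier;

public class WaitForMetric {
    // Poll `check` every `intervalMs` until it passes or `timeoutMs` elapses.
    // Returns the final check result, so callers can assert on it directly.
    static boolean waitFor(Supplier<Boolean> check, long timeoutMs, long intervalMs)
            throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (System.currentTimeMillis() < deadline) {
            if (check.get()) {
                return true;
            }
            Thread.sleep(intervalMs);
        }
        return check.get(); // one last look after the deadline
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Simulated gauge that reaches the expected value after ~150 ms,
        // like CorruptBlocks flipping once the replica is detected.
        Supplier<Boolean> corruptBlocksIsOne =
            () -> System.currentTimeMillis() - start > 150;
        System.out.println(waitFor(corruptBlocksIsOne, 2000, 20)); // true
    }
}
```

Note that polling only helps for a value that stays observable once reached; for the window Kihwal describes (corruption reported but not yet repaired), the test would also need to hold off the repair, e.g. by keeping the replacement replica unavailable until the metric is read.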
[jira] [Commented] (HDFS-3466) The SPNEGO filter for the NameNode should come out of the web keytab file
[ https://issues.apache.org/jira/browse/HDFS-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446718#comment-13446718 ] Hudson commented on HDFS-3466: -- Integrated in Hadoop-Mapreduce-trunk #1183 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1183/]) HDFS-3466. Get HTTP kerberos principal from the web authentication keytab. (omalley) (Revision 1379646) Result = SUCCESS omalley : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379646 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeHttpServer.java > The SPNEGO filter for the NameNode should come out of the web keytab file > - > > Key: HDFS-3466 > URL: https://issues.apache.org/jira/browse/HDFS-3466 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node, security >Affects Versions: 1.1.0, 2.0.0-alpha >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 1.1.0, 2.1.0-alpha > > Attachments: hdfs-3466-b1-2.patch, hdfs-3466-b1.patch, > hdfs-3466-trunk-2.patch, hdfs-3466-trunk-3.patch > > > Currently, the spnego filter uses the DFS_NAMENODE_KEYTAB_FILE_KEY to find > the keytab. It should use the DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY to > do it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3852) TestHftpDelegationToken is broken after HADOOP-8225
[ https://issues.apache.org/jira/browse/HDFS-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446715#comment-13446715 ] Hudson commented on HDFS-3852: -- Integrated in Hadoop-Mapreduce-trunk #1183 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1183/]) HDFS-3852. TestHftpDelegationToken is broken after HADOOP-8225 (daryn) (Revision 1379623) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379623 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpDelegationToken.java > TestHftpDelegationToken is broken after HADOOP-8225 > --- > > Key: HDFS-3852 > URL: https://issues.apache.org/jira/browse/HDFS-3852 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client, security >Affects Versions: 0.23.3, 2.1.0-alpha >Reporter: Aaron T. Myers >Assignee: Daryn Sharp > Fix For: 0.23.3, 3.0.0, 2.2.0-alpha > > Attachments: HDFS-3852.patch > > > It's been failing in all builds for the last 2 days or so. Git bisect > indicates that it's due to HADOOP-8225. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3873) Hftp assumes security is disabled if token fetch fails
[ https://issues.apache.org/jira/browse/HDFS-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446713#comment-13446713 ] Hudson commented on HDFS-3873: -- Integrated in Hadoop-Mapreduce-trunk #1183 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1183/]) HDFS-3873. Hftp assumes security is disabled if token fetch fails (daryn) (Revision 1379615) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379615 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpDelegationToken.java > Hftp assumes security is disabled if token fetch fails > -- > > Key: HDFS-3873 > URL: https://issues.apache.org/jira/browse/HDFS-3873 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Fix For: 0.23.3, 3.0.0, 2.2.0-alpha > > Attachments: HDFS-3873.branch-23.patch, HDFS-3873.patch > > > Hftp ignores all exceptions generated while trying to get a token, based on > the assumption that it means security is disabled. Debugging problems is > excruciatingly difficult when security is enabled but something goes wrong. > Job submissions succeed, but tasks fail because the NN rejects the user as > unauthenticated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3871) Change NameNodeProxies to use HADOOP-8748
[ https://issues.apache.org/jira/browse/HDFS-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446710#comment-13446710 ] Hudson commented on HDFS-3871: -- Integrated in Hadoop-Mapreduce-trunk #1183 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1183/]) HDFS-3871. Change NameNodeProxies to use RetryUtils. Contributed by Arun C Murthy (Revision 1379743) Result = SUCCESS szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379743 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java > Change NameNodeProxies to use HADOOP-8748 > - > > Key: HDFS-3871 > URL: https://issues.apache.org/jira/browse/HDFS-3871 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs client >Reporter: Arun C Murthy >Assignee: Arun C Murthy >Priority: Minor > Fix For: 1.2.0, 2.2.0-alpha > > Attachments: HDFS-3781_branch1.patch, HDFS-3781.patch > > > Change NameNodeProxies to use util method introduced via HADOOP-8748. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3833) TestDFSShell fails on Windows due to file concurrent read write
[ https://issues.apache.org/jira/browse/HDFS-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446706#comment-13446706 ] Hudson commented on HDFS-3833: -- Integrated in Hadoop-Mapreduce-trunk #1183 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1183/]) HDFS-3833. TestDFSShell fails on windows due to concurrent file read/write. Contributed by Brandon Li (Revision 1379525) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379525 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java > TestDFSShell fails on Windows due to file concurrent read write > --- > > Key: HDFS-3833 > URL: https://issues.apache.org/jira/browse/HDFS-3833 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0, 1-win >Reporter: Brandon Li >Assignee: Brandon Li > Fix For: 3.0.0, 1-win, 2.2.0-alpha > > Attachments: HDFS-3833.branch-1-win.patch, HDFS-3833.patch > > > TestDFSShell sometimes fails due to the race between the write issued by the > test and blockscanner. 
Example stack trace: > {noformat} > Error Message > c:\A\HM\build\test\data\dfs\data\data1\current\blk_-7735708801221347790 (The > requested operation cannot be performed on a file with a user-mapped section > open) > Stacktrace > java.io.FileNotFoundException: > c:\A\HM\build\test\data\dfs\data\data1\current\blk_-7735708801221347790 (The > requested operation cannot be performed on a file with a user-mapped section > open) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:194) > at java.io.FileOutputStream.(FileOutputStream.java:145) > at java.io.PrintWriter.(PrintWriter.java:218) > at org.apache.hadoop.hdfs.TestDFSShell.corrupt(TestDFSShell.java:1133) > at org.apache.hadoop.hdfs.TestDFSShell.testGet(TestDFSShell.java:1231) > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
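One way a test can tolerate this kind of transient Windows sharing violation is to retry the open for a bounded time, since the block scanner's mapped section is released shortly after. This is a self-contained sketch with hypothetical names, not the actual HDFS-3833 patch (which changed TestDFSShell itself).

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;

public class RetryOpenSketch {
    // Try to open `f` for writing up to `attempts` times, sleeping
    // `backoffMs` between tries. On Windows, FileNotFoundException is also
    // thrown when the file exists but is locked by a concurrent reader.
    static FileOutputStream openWithRetry(File f, int attempts, long backoffMs)
            throws IOException, InterruptedException {
        IOException last = null;
        for (int i = 0; i < attempts; i++) {
            try {
                return new FileOutputStream(f);
            } catch (FileNotFoundException e) {
                last = e;
                Thread.sleep(backoffMs);
            }
        }
        throw last; // all attempts exhausted; surface the final failure
    }

    public static void main(String[] args) throws Exception {
        File tmp = File.createTempFile("blk", ".tmp");
        try (FileOutputStream out = openWithRetry(tmp, 5, 10)) {
            out.write('x'); // uncontended here, so the first attempt succeeds
        }
        System.out.println(tmp.length()); // 1
        tmp.delete();
    }
}
```

The demo has no competing reader, so it succeeds immediately; the retry loop only earns its keep when another thread holds the file with a user-mapped section, as in the stack trace above.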
[jira] [Commented] (HDFS-3466) The SPNEGO filter for the NameNode should come out of the web keytab file
[ https://issues.apache.org/jira/browse/HDFS-3466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446690#comment-13446690 ] Hudson commented on HDFS-3466: -- Integrated in Hadoop-Hdfs-trunk #1152 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1152/]) HDFS-3466. Get HTTP kerberos principal from the web authentication keytab. (omalley) (Revision 1379646) Result = SUCCESS omalley : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379646 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeHttpServer.java > The SPNEGO filter for the NameNode should come out of the web keytab file > - > > Key: HDFS-3466 > URL: https://issues.apache.org/jira/browse/HDFS-3466 > Project: Hadoop HDFS > Issue Type: Bug > Components: name-node, security >Affects Versions: 1.1.0, 2.0.0-alpha >Reporter: Owen O'Malley >Assignee: Owen O'Malley > Fix For: 1.1.0, 2.1.0-alpha > > Attachments: hdfs-3466-b1-2.patch, hdfs-3466-b1.patch, > hdfs-3466-trunk-2.patch, hdfs-3466-trunk-3.patch > > > Currently, the spnego filter uses the DFS_NAMENODE_KEYTAB_FILE_KEY to find > the keytab. It should use the DFS_WEB_AUTHENTICATION_KERBEROS_KEYTAB_KEY to > do it. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3852) TestHftpDelegationToken is broken after HADOOP-8225
[ https://issues.apache.org/jira/browse/HDFS-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446687#comment-13446687 ] Hudson commented on HDFS-3852: -- Integrated in Hadoop-Hdfs-trunk #1152 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1152/]) HDFS-3852. TestHftpDelegationToken is broken after HADOOP-8225 (daryn) (Revision 1379623) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379623 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpDelegationToken.java > TestHftpDelegationToken is broken after HADOOP-8225 > --- > > Key: HDFS-3852 > URL: https://issues.apache.org/jira/browse/HDFS-3852 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client, security >Affects Versions: 0.23.3, 2.1.0-alpha >Reporter: Aaron T. Myers >Assignee: Daryn Sharp > Fix For: 0.23.3, 3.0.0, 2.2.0-alpha > > Attachments: HDFS-3852.patch > > > It's been failing in all builds for the last 2 days or so. Git bisect > indicates that it's due to HADOOP-8225. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3873) Hftp assumes security is disabled if token fetch fails
[ https://issues.apache.org/jira/browse/HDFS-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446685#comment-13446685 ] Hudson commented on HDFS-3873: -- Integrated in Hadoop-Hdfs-trunk #1152 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1152/]) HDFS-3873. Hftp assumes security is disabled if token fetch fails (daryn) (Revision 1379615) Result = SUCCESS daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379615 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpDelegationToken.java > Hftp assumes security is disabled if token fetch fails > -- > > Key: HDFS-3873 > URL: https://issues.apache.org/jira/browse/HDFS-3873 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Fix For: 0.23.3, 3.0.0, 2.2.0-alpha > > Attachments: HDFS-3873.branch-23.patch, HDFS-3873.patch > > > Hftp ignores all exceptions generated while trying to get a token, based on > the assumption that it means security is disabled. Debugging problems is > excruciatingly difficult when security is enabled but something goes wrong. > Job submissions succeed, but tasks fail because the NN rejects the user as > unauthenticated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3833) TestDFSShell fails on Windows due to file concurrent read write
[ https://issues.apache.org/jira/browse/HDFS-3833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446682#comment-13446682 ] Hudson commented on HDFS-3833: -- Integrated in Hadoop-Hdfs-trunk #1152 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1152/]) HDFS-3833. TestDFSShell fails on windows due to concurrent file read/write. Contributed by Brandon Li (Revision 1379525) Result = SUCCESS suresh : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379525 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDFSShell.java > TestDFSShell fails on Windows due to file concurrent read write > --- > > Key: HDFS-3833 > URL: https://issues.apache.org/jira/browse/HDFS-3833 > Project: Hadoop HDFS > Issue Type: Bug > Components: test >Affects Versions: 3.0.0, 1-win >Reporter: Brandon Li >Assignee: Brandon Li > Fix For: 3.0.0, 1-win, 2.2.0-alpha > > Attachments: HDFS-3833.branch-1-win.patch, HDFS-3833.patch > > > TestDFSShell sometimes fails due to the race between the write issued by the > test and blockscanner. 
Example stack trace: > {noformat} > Error Message > c:\A\HM\build\test\data\dfs\data\data1\current\blk_-7735708801221347790 (The > requested operation cannot be performed on a file with a user-mapped section > open) > Stacktrace > java.io.FileNotFoundException: > c:\A\HM\build\test\data\dfs\data\data1\current\blk_-7735708801221347790 (The > requested operation cannot be performed on a file with a user-mapped section > open) > at java.io.FileOutputStream.open(Native Method) > at java.io.FileOutputStream.(FileOutputStream.java:194) > at java.io.FileOutputStream.(FileOutputStream.java:145) > at java.io.PrintWriter.(PrintWriter.java:218) > at org.apache.hadoop.hdfs.TestDFSShell.corrupt(TestDFSShell.java:1133) > at org.apache.hadoop.hdfs.TestDFSShell.testGet(TestDFSShell.java:1231) > {noformat} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3871) Change NameNodeProxies to use HADOOP-8748
[ https://issues.apache.org/jira/browse/HDFS-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446676#comment-13446676 ] Hudson commented on HDFS-3871: -- Integrated in Hadoop-Mapreduce-trunk-Commit #2695 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Commit/2695/]) HDFS-3871. Change NameNodeProxies to use RetryUtils. Contributed by Arun C Murthy (Revision 1379743) Result = FAILURE szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379743 Files : * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java * /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java > Change NameNodeProxies to use HADOOP-8748 > - > > Key: HDFS-3871 > URL: https://issues.apache.org/jira/browse/HDFS-3871 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs client >Reporter: Arun C Murthy >Assignee: Arun C Murthy >Priority: Minor > Fix For: 1.2.0, 2.2.0-alpha > > Attachments: HDFS-3781_branch1.patch, HDFS-3781.patch > > > Change NameNodeProxies to use util method introduced via HADOOP-8748. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3873) Hftp assumes security is disabled if token fetch fails
[ https://issues.apache.org/jira/browse/HDFS-3873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446669#comment-13446669 ] Hudson commented on HDFS-3873: -- Integrated in Hadoop-Hdfs-0.23-Build #361 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/361/]) HDFS-3873. Hftp assumes security is disabled if token fetch fails (daryn) (Revision 1379620) Result = UNSTABLE daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379620 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/HftpFileSystem.java * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpDelegationToken.java > Hftp assumes security is disabled if token fetch fails > -- > > Key: HDFS-3873 > URL: https://issues.apache.org/jira/browse/HDFS-3873 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client >Affects Versions: 0.23.3, 3.0.0, 2.2.0-alpha >Reporter: Daryn Sharp >Assignee: Daryn Sharp > Fix For: 0.23.3, 3.0.0, 2.2.0-alpha > > Attachments: HDFS-3873.branch-23.patch, HDFS-3873.patch > > > Hftp ignores all exceptions generated while trying to get a token, based on > the assumption that it means security is disabled. Debugging problems is > excruciatingly difficult when security is enabled but something goes wrong. > Job submissions succeed, but tasks fail because the NN rejects the user as > unauthenticated. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3852) TestHftpDelegationToken is broken after HADOOP-8225
[ https://issues.apache.org/jira/browse/HDFS-3852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446672#comment-13446672 ] Hudson commented on HDFS-3852: -- Integrated in Hadoop-Hdfs-0.23-Build #361 (See [https://builds.apache.org/job/Hadoop-Hdfs-0.23-Build/361/]) svn merge -c 1379623 FIXES: HDFS-3852. TestHftpDelegationToken is broken after HADOOP-8225 (daryn) (Revision 1379627) Result = UNSTABLE daryn : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379627 Files : * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * /hadoop/common/branches/branch-0.23/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestHftpDelegationToken.java > TestHftpDelegationToken is broken after HADOOP-8225 > --- > > Key: HDFS-3852 > URL: https://issues.apache.org/jira/browse/HDFS-3852 > Project: Hadoop HDFS > Issue Type: Bug > Components: hdfs client, security >Affects Versions: 0.23.3, 2.1.0-alpha >Reporter: Aaron T. Myers >Assignee: Daryn Sharp > Fix For: 0.23.3, 3.0.0, 2.2.0-alpha > > Attachments: HDFS-3852.patch > > > It's been failing in all builds for the last 2 days or so. Git bisect > indicates that it's due to HADOOP-8225. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HDFS-3886) Shutdown requests can possibly check for checkpoint issues (corrupted edits) and save a good namespace copy before closing down?
[ https://issues.apache.org/jira/browse/HDFS-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446659#comment-13446659 ]

Harsh J commented on HDFS-3886:
-------------------------------

Thanks much Steve. Perhaps instead can we have the shutdown scripts call a savenamespace pre signal? That way we sorta achieve the same?

> Shutdown requests can possibly check for checkpoint issues (corrupted edits)
> and save a good namespace copy before closing down?
> ----------------------------------------------------------------------------
>
> Key: HDFS-3886
> URL: https://issues.apache.org/jira/browse/HDFS-3886
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 2.0.0-alpha
> Reporter: Harsh J
> Priority: Minor
>
> HDFS-3878 sorta gives me this idea. Aside of having a method to download it
> to a different location, we can also lock up the namesystem (or deactivate
> the client rpc server) and save the namesystem before we complete up the
> shutdown.
> The init.d/shutdown scripts would have to work with this somehow though, to
> not kill -9 it when in-process. Also, the new image may be stored in a
> shutdown.chkpt directory, to not interfere in the regular dirs, but still
> allow easier recovery.
> Obviously this will still not work if all directories are broken. So maybe we
> could have some configs to tackle that as well?
> I haven't thought this through, so let me know what part is wrong to do :)
[jira] [Commented] (HDFS-3871) Change NameNodeProxies to use HADOOP-8748
[ https://issues.apache.org/jira/browse/HDFS-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446656#comment-13446656 ]

Hudson commented on HDFS-3871:
------------------------------

Integrated in Hadoop-Common-trunk-Commit #2670 (See [https://builds.apache.org/job/Hadoop-Common-trunk-Commit/2670/])
HDFS-3871. Change NameNodeProxies to use RetryUtils. Contributed by Arun C Murthy (Revision 1379743)

Result = SUCCESS
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379743
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java

> Change NameNodeProxies to use HADOOP-8748
> -----------------------------------------
>
> Key: HDFS-3871
> URL: https://issues.apache.org/jira/browse/HDFS-3871
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs client
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Priority: Minor
> Fix For: 1.2.0, 2.2.0-alpha
>
> Attachments: HDFS-3781_branch1.patch, HDFS-3781.patch
>
> Change NameNodeProxies to use util method introduced via HADOOP-8748.
[jira] [Commented] (HDFS-3871) Change NameNodeProxies to use HADOOP-8748
[ https://issues.apache.org/jira/browse/HDFS-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446658#comment-13446658 ]

Hudson commented on HDFS-3871:
------------------------------

Integrated in Hadoop-Hdfs-trunk-Commit #2733 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Commit/2733/])
HDFS-3871. Change NameNodeProxies to use RetryUtils. Contributed by Arun C Murthy (Revision 1379743)

Result = SUCCESS
szetszwo : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1379743
Files :
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/NameNodeProxies.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/web/WebHdfsFileSystem.java

> Change NameNodeProxies to use HADOOP-8748
> -----------------------------------------
>
> Key: HDFS-3871
> URL: https://issues.apache.org/jira/browse/HDFS-3871
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs client
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Priority: Minor
> Fix For: 1.2.0, 2.2.0-alpha
>
> Attachments: HDFS-3781_branch1.patch, HDFS-3781.patch
>
> Change NameNodeProxies to use util method introduced via HADOOP-8748.
[jira] [Updated] (HDFS-3871) Change NameNodeProxies to use HADOOP-8748
[ https://issues.apache.org/jira/browse/HDFS-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tsz Wo (Nicholas), SZE updated HDFS-3871:
-----------------------------------------

    Resolution: Fixed
    Fix Version/s: 2.2.0-alpha
                   1.2.0
    Status: Resolved (was: Patch Available)

No problem. I have committed this. Thanks, Arun!

> Change NameNodeProxies to use HADOOP-8748
> -----------------------------------------
>
> Key: HDFS-3871
> URL: https://issues.apache.org/jira/browse/HDFS-3871
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: hdfs client
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Priority: Minor
> Fix For: 1.2.0, 2.2.0-alpha
>
> Attachments: HDFS-3781_branch1.patch, HDFS-3781.patch
>
> Change NameNodeProxies to use util method introduced via HADOOP-8748.
[jira] [Commented] (HDFS-3886) Shutdown requests can possibly check for checkpoint issues (corrupted edits) and save a good namespace copy before closing down?
[ https://issues.apache.org/jira/browse/HDFS-3886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13446653#comment-13446653 ]

Steve Loughran commented on HDFS-3886:
--------------------------------------

currently a kill -2 is the event sent from init.d to trigger a managed shutdown, but it needs to complete within a bounded period, otherwise robust init.d/linux HA scripts will escalate to a -9; this is because they need to reliably shut down the system. Any change that prevents service scripts from having timeout+escalation would be counterproductive from a service management perspective. Now, if there were another signal handler that triggered lock up and system save, that could be good - but that would lie outside init.d land

> Shutdown requests can possibly check for checkpoint issues (corrupted edits)
> and save a good namespace copy before closing down?
> ----------------------------------------------------------------------------
>
> Key: HDFS-3886
> URL: https://issues.apache.org/jira/browse/HDFS-3886
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 2.0.0-alpha
> Reporter: Harsh J
> Priority: Minor
>
> HDFS-3878 sorta gives me this idea. Aside of having a method to download it
> to a different location, we can also lock up the namesystem (or deactivate
> the client rpc server) and save the namesystem before we complete up the
> shutdown.
> The init.d/shutdown scripts would have to work with this somehow though, to
> not kill -9 it when in-process. Also, the new image may be stored in a
> shutdown.chkpt directory, to not interfere in the regular dirs, but still
> allow easier recovery.
> Obviously this will still not work if all directories are broken. So maybe we
> could have some configs to tackle that as well?
> I haven't thought this through, so let me know what part is wrong to do :)
[jira] [Created] (HDFS-3886) Shutdown requests can possibly check for checkpoint issues (corrupted edits) and save a good namespace copy before closing down?
Harsh J created HDFS-3886:
--------------------------

    Summary: Shutdown requests can possibly check for checkpoint issues (corrupted edits) and save a good namespace copy before closing down?
    Key: HDFS-3886
    URL: https://issues.apache.org/jira/browse/HDFS-3886
    Project: Hadoop HDFS
    Issue Type: Bug
    Components: name-node
    Affects Versions: 2.0.0-alpha
    Reporter: Harsh J
    Priority: Minor

HDFS-3878 sorta gives me this idea. Aside of having a method to download it to a different location, we can also lock up the namesystem (or deactivate the client rpc server) and save the namesystem before we complete up the shutdown.

The init.d/shutdown scripts would have to work with this somehow though, to not kill -9 it when in-process. Also, the new image may be stored in a shutdown.chkpt directory, to not interfere in the regular dirs, but still allow easier recovery.

Obviously this will still not work if all directories are broken. So maybe we could have some configs to tackle that as well?

I haven't thought this through, so let me know what part is wrong to do :)
[jira] [Created] (HDFS-3885) QJM: optimize log sync when JN is lagging behind
Todd Lipcon created HDFS-3885:
------------------------------

    Summary: QJM: optimize log sync when JN is lagging behind
    Key: HDFS-3885
    URL: https://issues.apache.org/jira/browse/HDFS-3885
    Project: Hadoop HDFS
    Issue Type: Sub-task
    Affects Versions: QuorumJournalManager (HDFS-3077)
    Reporter: Todd Lipcon

This is a potential optimization that we can add to the JournalNode: when one of the nodes is lagging behind the others (eg because its local disk is slower or there was a network blip), it receives edits after they've been committed to a majority. It can tell this because the committed txid included in the request info is higher than the highest txid in the actual batch to be written. In this case, we know that this batch has already been fsynced to a quorum of nodes, so we can skip the fsync() on the laggy node, helping it to catch back up.
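The skip condition described above is tiny; this is a hypothetical sketch of it (names invented for illustration, not the actual JournalNode code). If the committed txid carried in the request info is beyond the last txid in the batch being written, a quorum has already durably stored these edits, so the lagging node can skip its own fsync.

```java
public class SyncPolicy {
    /**
     * Returns true when this batch has already been committed on a quorum:
     * the writer's committedTxId is ahead of every txid in the batch, so a
     * local fsync adds no durability the cluster doesn't already have.
     */
    public static boolean canSkipFsync(long committedTxId, long lastTxIdInBatch) {
        return committedTxId > lastTxIdInBatch;
    }
}
```

When the committed txid merely equals the batch's last txid, this batch may be the one establishing the quorum commit, so the sketch keeps the fsync in that case.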
[jira] [Updated] (HDFS-3870) QJM: add metrics to JournalNode
[ https://issues.apache.org/jira/browse/HDFS-3870?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-3870:
------------------------------

    Attachment: hdfs-3870.txt

> QJM: add metrics to JournalNode
> -------------------------------
>
> Key: HDFS-3870
> URL: https://issues.apache.org/jira/browse/HDFS-3870
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: QuorumJournalManager (HDFS-3077)
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
>
> Attachments: hdfs-3870.txt
>
> The JournalNode should expose some basic metrics through the usual interface.
> In particular:
> - the writer epoch, accepted epoch,
> - the last written transaction ID and last committed txid (which may be newer
> in case that it's in the process of catching up)
> - latency information for how long the syncs are taking
> Please feel free to suggest others that come to mind.
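The metrics listed in the quoted description can be pictured as a plain snapshot; the field names below are illustrative guesses, not the JournalNode's real metrics interface (which would go through Hadoop's usual metrics framework).

```java
public class JournalNodeMetricsSnapshot {
    public final long writerEpoch;
    public final long acceptedEpoch;
    public final long lastWrittenTxId;
    // May be ahead of lastWrittenTxId while this node is catching up.
    public final long lastCommittedTxId;
    public final double avgSyncLatencyMicros;

    public JournalNodeMetricsSnapshot(long writerEpoch, long acceptedEpoch,
            long lastWrittenTxId, long lastCommittedTxId,
            double avgSyncLatencyMicros) {
        this.writerEpoch = writerEpoch;
        this.acceptedEpoch = acceptedEpoch;
        this.lastWrittenTxId = lastWrittenTxId;
        this.lastCommittedTxId = lastCommittedTxId;
        this.avgSyncLatencyMicros = avgSyncLatencyMicros;
    }

    /** Lag of the local write pointer behind the quorum commit point. */
    public long commitLag() {
        return lastCommittedTxId - lastWrittenTxId;
    }
}
```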
[jira] [Updated] (HDFS-3884) QJM: Journal format() should reset cached values
[ https://issues.apache.org/jira/browse/HDFS-3884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Todd Lipcon updated HDFS-3884:
------------------------------

    Attachment: hdfs-3884.txt

> QJM: Journal format() should reset cached values
> ------------------------------------------------
>
> Key: HDFS-3884
> URL: https://issues.apache.org/jira/browse/HDFS-3884
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: QuorumJournalManager (HDFS-3077)
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Fix For: QuorumJournalManager (HDFS-3077)
>
> Attachments: hdfs-3884.txt
>
> Simple bug in the JournalNode: it caches certain values (eg accepted epoch)
> in memory, and the cached values aren't reset when the journal is formatted.
> So, after a format, further calls to the same Journal will see the old value
> for accepted epoch, writer epoch, etc, preventing the journal from being
> re-used until the JN is restarted.
[jira] [Created] (HDFS-3884) QJM: Journal format() should reset cached values
Todd Lipcon created HDFS-3884:
------------------------------

    Summary: QJM: Journal format() should reset cached values
    Key: HDFS-3884
    URL: https://issues.apache.org/jira/browse/HDFS-3884
    Project: Hadoop HDFS
    Issue Type: Sub-task
    Affects Versions: QuorumJournalManager (HDFS-3077)
    Reporter: Todd Lipcon
    Assignee: Todd Lipcon
    Fix For: QuorumJournalManager (HDFS-3077)
    Attachments: hdfs-3884.txt

Simple bug in the JournalNode: it caches certain values (eg accepted epoch) in memory, and the cached values aren't reset when the journal is formatted. So, after a format, further calls to the same Journal will see the old value for accepted epoch, writer epoch, etc, preventing the journal from being re-used until the JN is restarted.
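The bug described above is a cache-invalidation miss; this minimal hypothetical sketch (not the real Journal class) shows the shape of the fix: format() must clear the in-memory epoch values along with the on-disk state, or callers keep seeing stale epochs until restart.

```java
public class Journal {
    public static final long INVALID_EPOCH = 0L;

    // Values cached in memory alongside the on-disk journal state.
    private long lastPromisedEpoch = INVALID_EPOCH;
    private long lastWriterEpoch = INVALID_EPOCH;

    public void newEpoch(long epoch) { lastPromisedEpoch = epoch; }
    public void startLogSegment(long epoch) { lastWriterEpoch = epoch; }

    public long getLastPromisedEpoch() { return lastPromisedEpoch; }
    public long getLastWriterEpoch() { return lastWriterEpoch; }

    /** Wipes journal state; the two resets below are the fix described above. */
    public void format() {
        // ... delete and re-create the storage directories here ...
        lastPromisedEpoch = INVALID_EPOCH; // without these resets, a formatted
        lastWriterEpoch = INVALID_EPOCH;   // journal keeps serving stale epochs
    }
}
```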